Drupal + Google Search Appliance + Multiple languages and collections

2011-07-06 20:29:10 PST

So at work we use Drupal a lot. One of our clients has a Google Search Appliance (a mini Google in a box, installed in their datacenter for their own use) and runs sites in multiple languages. So we installed the Drupal GSA module, switched it on, and had Google searches. Since we had multiple languages, we set up GSA collections, one per language. Easy enough. But by default the Drupal GSA module only takes one collection in its admin page. So we did what we always do in this situation: we made the collection a multilingual variable, a kind of Drupal hack that lets a variable transparently take a different value depending on the site’s language. Then we entered the matching GSA collection into each language site’s GSA config.

Done and good to go? Nope. All the searches were still in English on all the sites. So on the French site we cleared the cache, and then the searches were in French! … for all the sites. The Drupal GSA module, it turned out, was deeply not designed to work with multiple languages/collections.

After about 2 hours of debugging we figured it all out. When the module is initialized, it pulls out all the config data, puts it in a hash, and hands it off to Drupal in the menu hook. Drupal caches it and passes it to the search function whenever it is called, so whatever language is active when the menu is built determines which collection is used, for all languages, until the cache is next cleared. Not cunning. The fix, however, was straightforward and one line: in the search function, just disregard the passed-in settings value for the collection and re-pull it.
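To make the failure mode concrete, here is a tiny language-neutral model of it, written in Python rather than PHP. Nothing in it is Drupal code: `STATE`, `VARIABLES`, `MENU_CACHE`, the collection names and all of the functions are hypothetical stand-ins for the per-language variable, the cached menu arguments and the search callback. It is only a sketch of the idea, assuming the cache behaves the way described above.

```python
# Hypothetical stand-ins: a per-language variable store and a menu cache.
STATE = {"language": "en"}
VARIABLES = {
    "en": {"collection": " site_en "},
    "fr": {"collection": " site_fr "},
}
MENU_CACHE = {}


def variable_get(name, default):
    """Return the value of `name` for whatever language is active right now."""
    return VARIABLES.get(STATE["language"], {}).get(name, default)


def rebuild_menu():
    """Capture the collection once, the way the menu hook captured its args."""
    MENU_CACHE["collection"] = variable_get("collection", "default_collection")


def search_view_buggy(collection):
    # Trusts the argument that was frozen when the menu was built.
    return collection.strip()


def search_view_fixed(collection):
    # Ignores the frozen argument and re-reads the variable on every call.
    return variable_get("collection", "default_collection").strip()


def run_search(view):
    """Call a search view with the cached argument, as the menu system would."""
    return view(MENU_CACHE["collection"])
```

With the menu built while English is active and a request then arriving on the French site, `run_search(search_view_buggy)` still answers with the English collection, while `run_search(search_view_fixed)` answers with the French one.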

For what it’s worth I filed a bug and patch (against 6.x-2.0-beta1, though it should work fine for all versions) with the module at http://drupal.org/node/1208892 but it seems it’s suffering from low maintainership.

Here it is as well, though it’s probably just as easy to copy/paste the line in directly.

Index: google_appliance.module
===================================================================
--- google_appliance.module	(revision 14982)
+++ google_appliance.module	(working copy)
@@ -1055,6 +1055,9 @@
  *   Themed search results.
  */
 function google_appliance_search_view($search_base, $client, $collection, $keys = NULL, $title = NULL) {
+  
+  $collection =  trim(variable_get('google_appliance_default_collection', 'default_collection'));
+  
   $form = drupal_get_form('google_appliance_search_form', NULL, $search_base, $client, $collection, $keys);
   // When POSTing back to an existing search-results page, the original
   // URL is accessed (which re-runs that search) and then the redirect for

Now it works just fine and all our different language sites have searches and results in just their language. Splendid!

Job

2011-01-24 21:54:56 PST

I’ve finally finished school. It’s been a bit of a journey, about 6 and 1/2 years, but I did it. In a few months UBC will give me a Bachelor of Science, majoring in Computer Science. I’ve had a good go. I took a lot of neat classes on a lot of interesting subjects. I’m happy with my education and record. I had time to write some really cool code on my own, like the mindstab Go AI competition, learning Lisp, and Cortex (partly for school), to name a few. And in between all that I’ve also had the time to travel, to go to some really cool places: Mexico, Guatemala, China, South Korea, Hong Kong, and Colombia. Life has been lucky and good.

Now it’s time for something new, a new phase. And so to kick that off, I’ve landed (luckily) a one month trial contract at a web company downtown, as a PHP (and other assorted open source technologies) developer. This is an amazing opportunity and I hope it goes well. From the two days I’ve had with them so far I really like it and would be very happy there permanently. Either way, I’m now much busier than I’ve been in a while.

Gallery3 is Not Ready

2011-01-04 12:15:02 PST

So I’m setting up a new website for a client (an artist), and the easiest option, which is what I’ve always done, is to use Gallery. I’ve used 2.* for years and now 3.0 is out, and has been for a few months. So I figured why not give it a try.

I’ve never known a website to have intermittent bugs, but Gallery3 has a good couple of them. Sometimes the spacing around photo/album items is just way too big, and after a mouse over, they jump position. A good half the time, trying to delete an item takes you to a blank white page with a single option, “delete”, and that in turn takes you to a dead-end page showing what is clearly a raw AJAX reply. But only sometimes. How do you track down bugs that only happen half the time? The default theme, wind, seems to lean heavily on jQuery and I think, but am not sure, that this is where the instability is coming from. Not having boned up to the level of jQuery master, I certainly can’t dive in. Also, apparently some of these issues are already known but still haven’t been fixed, so we can assume better minds than mine have looked.

So that’s a bit disappointing, and a waste of a day’s work that I get to eat. Gallery3 is not stable and usable. Back to Gallery2.

(Also, I’m not even really a fan of hiding things like item names by default and only showing them on mouse over. It’s bad for the kind of galleries I’m putting up, but there isn’t even an option for that, and again, I’m not recoding a ton of jQuery code.)

Frustrated.

Python and SOCKS proxies: thread safe or unicode, pick one :(

2010-09-27 09:02:39 PST

Update (2011-05-06): Reader and generally helpful person Brian Visel has posted a solution in the comments. Thanks!

So I’m writing an app for a customer. The language used is Python. The part I’m working on now is a bit like a web spider. It needs to be multi-threaded, handle Unicode, and be able to rotate through multiple proxies. I’ve been using urllib2 to do the data fetching and it’s served me well so far. Getting it to work with HTTP proxies wasn’t too much trouble: just create a ProxyHandler, pass it to build_opener(), call opener.open(), and voilà, a proxied thread-safe Unicode solution.
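For reference, the HTTP proxy approach looks roughly like the sketch below. It is not the customer code: `make_opener`, `fetch`, the default timeout and the UTF-8 default are assumptions for illustration. The import falls back from `urllib.request` (the Python 3 home of these names) to `urllib2`. Each call builds its own opener instead of calling `install_opener()`, so threads using different proxies never share proxy state.

```python
try:
    from urllib.request import ProxyHandler, build_opener  # Python 3
except ImportError:
    from urllib2 import ProxyHandler, build_opener  # Python 2


def make_opener(proxy_url):
    """Build an opener that sends http and https requests via proxy_url."""
    handler = ProxyHandler({"http": proxy_url, "https": proxy_url})
    # build_opener() returns a new OpenerDirector; nothing global is changed.
    return build_opener(handler)


def fetch(url, proxy_url, charset="utf-8", timeout=30):
    """Fetch url through the given proxy and return the body as text.

    The charset is an assumption here; a real crawler would take it from
    the Content-Type header or the page itself.
    """
    opener = make_opener(proxy_url)
    response = opener.open(url, timeout=timeout)
    try:
        raw = response.read()
    finally:
        response.close()
    return raw.decode(charset, "replace")
```

A worker thread would then call something like `fetch(url, next_proxy)` with whichever proxy it has been handed for that request.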

However, things took a turn for the more complicated when I tried to add SOCKS proxy support. It’s apparently not natively handled by urllib2, because SOCKS works at a lower level, on the socket itself. A bit of googling turned me onto my first not-a-solution: SocksiPy, whose approach is to create a new socket class that routes through the SOCKS proxy and use it to replace the global socket.socket. Then urllib2 will of course have its traffic routed through the SOCKS proxy. The only problem is that this is massively not thread safe if other threads are using other SOCKS or HTTP proxies.
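The thread-safety problem is easy to demonstrate without SocksiPy at all. In the sketch below, `ProxyASocket` is a hypothetical stand-in for a SOCKS-wrapping socket class, and the function names are made up for illustration. One thread swaps it in for `socket.socket`, and a completely unrelated thread then sees the swapped class.

```python
import socket
import threading


class ProxyASocket(socket.socket):
    """Hypothetical stand-in for a socket class that tunnels via proxy A."""


def patch_global_socket():
    # Rebinding a module attribute affects the whole process, not one thread.
    socket.socket = ProxyASocket


def what_other_threads_see():
    """Patch socket.socket in one thread and report what another one sees."""
    original = socket.socket
    seen = {}
    try:
        patcher = threading.Thread(target=patch_global_socket)
        patcher.start()
        patcher.join()

        def look():
            seen["bystander"] = socket.socket

        bystander = threading.Thread(target=look)
        bystander.start()
        bystander.join()
    finally:
        socket.socket = original  # undo the patch so nothing else is affected
    return seen
```

Any thread that opens a connection after the patch gets the patched class, so two threads that each want a different proxy end up overwriting each other’s choice.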

So I did some more googling and found that I could bypass urllib2 entirely and try pycURL, a thin Python wrapper for cURL. This actually was thread safe, however I can’t seem to get Unicode support out of it for URLs or returned data, which is a no go because the sites this code will be crawling use Unicode.
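One generic way around byte-only HTTP clients is to do the Unicode conversion at the edges, outside the library. The sketch below is not something I have tried with pycURL and it is not the fix from the comments. It is Python 3, standard library only, and `to_ascii_url` and `decode_body` are hypothetical helper names. It assumes UTF-8 percent-encoding for paths and queries and IDNA for hostnames, and it drops any user:password part of the URL.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Turn a Unicode URL into an ASCII-only one that a byte API can take."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Non-ASCII hostnames become punycode; plain ASCII ones come back as-is.
    netloc = host.encode("idna").decode("ascii")
    if parts.port:
        netloc += ":%d" % parts.port
    # "%" stays in the safe set so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


def decode_body(raw, charset="utf-8"):
    """Decode response bytes to text; the charset default is an assumption."""
    return raw.decode(charset, errors="replace")
```

The idea would be to pass `to_ascii_url(url)` to the client and run whatever bytes come back through `decode_body()` with the charset the site declares.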

So I’m basically stuck, not able to handle SOCKS thread-safely with Unicode in Python :/ I have other parts of the project to work on before my deadline, so I’m moving on to those now. If I get done early I guess I’ll come back to this and see if I can find something better.
