Last night I attended my first Drupal meet-up, the Ann Arbor Drupal users group. Overall it was pretty entertaining, and I was impressed that so many people in the local area are interested in Drupal. There were about 10 attendees, and the meeting lasted a little over 2 hours. There was a wide range of people, from developers to consultants to people who are new, so it was nice to hear from a broad group and see people discuss many aspects of Drupal.
Paul Resnick, a professor at the University of Michigan, gave a presentation on a module they are working on in conjunction with drupal.org. It is called the "pivots module" and seemed very interesting. There was much discussion about finding solutions to problems within Drupal, and this module hopes to help solve some of those issues. The module's intent is to proactively search drupal.org, find posts related to particular modules, and then display the results in a block on the project page. Many issues came up in the discussion, such as scalability, storage of the data, how user analytics play into the module, general performance, and the complexity of the algorithm (big O).
The module basically works by giving it a key (the project name); it then searches through forum posts, support questions, etc., in hopes of finding relevant solutions to problems. So it is basically a hash table where you have a key and a set of values that correspond to that key. A lot of the discussion pertained to how you assign weights to particular terms in hopes of making certain posts more relevant than others. Also, as I mentioned above, there is the issue of performance on a site as large as drupal.org (over 100,000 nodes). The search algorithm (since it is basically being run on a hash table) will run at O(n) worst case for one project, since you have to take that project and run through every node (n), but then you have to do that for every project page (p), so it's actually O(p * n). But the algorithm also implements incremental indexing, so once the bulk of the processing has been completed, there isn't much work after that.
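To make the key/value idea concrete, here is a minimal Python sketch of how I understood it. This is not the actual pivots code: the `PivotIndex` class, the `score` function, the node fields (`nid`, `title`, `body`), and the title/body weights are all my own assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical weights: a match in the title counts more than one in the body.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def score(project, node):
    """Weighted count of how often the project name appears in a node."""
    name = project.lower()
    return (TITLE_WEIGHT * node["title"].lower().count(name)
            + BODY_WEIGHT * node["body"].lower().count(name))


class PivotIndex:
    """Maps project name (key) -> {node id: relevance score} (values)."""

    def __init__(self, projects):
        self.projects = list(projects)
        self.table = defaultdict(dict)
        self.last_indexed = 0  # highest node id processed so far

    def index(self, nodes):
        """Process only nodes newer than the last run (incremental indexing)."""
        new_nodes = [n for n in nodes if n["nid"] > self.last_indexed]
        for node in new_nodes:             # n nodes
            for project in self.projects:  # p projects, so O(p * n) overall
                s = score(project, node)
                if s > 0:
                    self.table[project][node["nid"]] = s
        if new_nodes:
            self.last_indexed = max(n["nid"] for n in new_nodes)
        return len(new_nodes)

    def related(self, project, limit=5):
        """Node ids for a project, highest score first (ties by node id)."""
        hits = self.table.get(project, {})
        return sorted(hits, key=lambda nid: (-hits[nid], nid))[:limit]
```

The first call to `index()` pays the full O(p * n) cost; later calls only touch nodes with a higher id than the last one seen, which is the "not much work after that" part. Looking up `related("views")` for the block on a project page is then just a dictionary read plus a sort of that project's hits.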
One thing to look at may be how the giant, Google, does it: they use Sawzall and MapReduce to analyze massive amounts of data in a similar way. They also talked about offloading the data processing asynchronously to another server, but ran into the issue of JavaScript not allowing data transfer across domains. I suggested the use of a proxy, and I know this technique works because the Jikto JavaScript scanner uses this method, as does Google Translate. And that's why there is much discussion of inherent security issues in AJAX.
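For anyone curious what the proxy idea looks like, here is a rough Python sketch using only the standard library. The page's JavaScript would request `/proxy?url=...` on its own domain, and the server fetches the remote data on its behalf. The host name `processing.example.org`, the `/proxy` path, and the helper names are placeholders I made up, not anything the group actually built.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical host that does the heavy processing; only it may be proxied.
ALLOWED_HOSTS = {"processing.example.org"}


def is_allowed(url):
    """Only relay plain http(s) requests to hosts we explicitly trust."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The browser asks its own domain: /proxy?url=<remote url>
        query = parse_qs(urlparse(self.path).query)
        target = query.get("url", [""])[0]
        if not is_allowed(target):
            self.send_error(403, "Target host not allowed")
            return
        try:
            with urllib.request.urlopen(target, timeout=10) as upstream:
                body = upstream.read()
                ctype = upstream.headers.get("Content-Type", "text/plain")
        except urllib.error.URLError:
            self.send_error(502, "Upstream request failed")
            return
        # Relay the remote response back to the browser from the same origin.
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Build the proxy server; call .serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), ProxyHandler)
```

The allow-list is the important part: a proxy that will fetch any URL it is handed is exactly the kind of hole that makes people nervous about AJAX security, so this sketch refuses everything except the one processing host.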
Overall I very much enjoyed attending the group and got to meet some interesting people, and I think the pivots module is something that would be very useful in any community-based Drupal install. Sadly, I won't be able to attend any more meetings because I will be heading back to Michigan Tech to finish my degree at the end of this month.