Vijay Samuel's Blog

Drizzle, now a part of Software in the Public Interest

Posted by: vjsamuel on: October 6, 2011

It’s been a while since I’ve blogged, but I couldn’t think of a better time to resume than when Drizzle officially became associated with Software in the Public Interest (SPI). I’ve been a part of Drizzle for almost a year and a half now and my passion for Drizzle seems to grow every day. It is always good to see changes that happen for the better, and I think this is one of them. Now that Drizzle is a part of SPI, it has a legal entity behind it, which is always good. How can you benefit from this, you may ask? If you are a US taxpayer, any donation that you make will be tax deductible, and all your valuable contributions will be used towards the betterment of Drizzle. The easiest way to donate is using a credit card at Click & Pledge. The SPI website lists some alternative methods, such as using a cheque. So, please do make your valuable contributions towards Drizzle. As always, I feel proud to be a part of the Drizzle family and will continue to strive for the betterment of Drizzle. :) :) :)

Stored Procedure Interface for Drizzle

Posted by: vjsamuel on: March 23, 2011

I’ve been doing some reading on stored procedures and how they are defined and executed. These are some of the points which I think should be covered in our Stored Procedure Interface, along with some of my suggestions. I’m open to suggestions and criticism. According to what we discussed in the channel, we need to make the stored procedure interface pluggable. So, a part of the interface will reside within drizzled and the client part of the interface will reside within the plugin itself.

I personally feel we could work on this interface in a series of 5 to 6 iterations.

1) Write a grammar for our stored procedures, a lexical analyser and some parser code using flex and bison. I think we could abide by the SQL standard as much as possible from the earlier stages so that we don’t need to refactor much later on. After we write the grammar we need to test thoroughly!!! The earlier we find bugs the better.

2) Update sql_lex and sql_yacc so that the new keywords STORED and PROCEDURE are understood by our SQL grammar. Update the client code so that we are able to use the STORED and PROCEDURE keywords. Update the bison code to CREATE and DROP stored procedures. Use EXECUTE_SYM to execute the stored procedures.

3) We need to store our stored procedures in tables, so we will have to write protobuffers for the new fragment of code that is going to enable us to store the stored procedures in the tables.

Now, after the third pass we could merge the code into trunk and _technically_ we should be able to run stored procedures that have only SQL statements. Once we get this working we should be able to add the rest of the features with patches.

4) Determine a convention for denoting variables. SQL Server prefixes names with @ to denote that the given name is a variable. Enable stored procedures to accept input parameters. We will need to rewrite the protobuffers because we need to use these variables in our tables and give special meaning to them in the future (i.e. whether they are IN, OUT or INOUT). The interface will not support IN, OUT and INOUT in this pass though.

5) Add support for IN, OUT and INOUT. We need to think of a good way to prevent modification of IN variables. I do not know how to make a table entry read-only. We also need to be able to return values in the case of OUT and INOUT variables. We could have a column that denotes whether a variable is IN, OUT or INOUT and, based on the entry, give write permission on that variable. Just a suggestion.

6) Add SET to the stored procedures grammar. This will enable us to use local variables. The protobuffers need to be rewritten so that the local variables can be stored in our tables.
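The idea in step 5 of recording a mode per variable and granting write permission based on it can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `ParamMode`, `Param` and `ParamTable` are hypothetical names, the values are kept as strings for simplicity, and nothing here reflects actual Drizzle code or its table storage.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical parameter modes, mirroring the IN / OUT / INOUT idea above.
enum class ParamMode { In, Out, InOut };

struct Param {
    ParamMode mode;
    std::string value;
};

// Hypothetical holder for the parameters of one stored procedure call.
class ParamTable {
public:
    void declare(const std::string& name, ParamMode mode,
                 const std::string& initial = "") {
        params_[name] = Param{mode, initial};
    }

    // Writes are rejected for IN parameters, which are read-only
    // inside the procedure body.
    void set(const std::string& name, const std::string& value) {
        Param& p = params_.at(name);
        if (p.mode == ParamMode::In)
            throw std::logic_error("cannot modify IN parameter: " + name);
        p.value = value;
    }

    const std::string& get(const std::string& name) const {
        return params_.at(name).value;
    }

    // Only OUT and INOUT parameters are handed back to the caller.
    std::map<std::string, std::string> results() const {
        std::map<std::string, std::string> out;
        for (const auto& kv : params_)
            if (kv.second.mode != ParamMode::In) out[kv.first] = kv.second.value;
        return out;
    }

private:
    std::map<std::string, Param> params_;
};
```

Declaring `@limit` as `In` and `@total` as `Out`, a call to `set("@limit", ...)` throws while `set("@total", ...)` succeeds, and only `@total` appears in `results()`.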

I need to do a lot of reading on Google protobuffers and brush up on flex and bison. The first three iterations are the hardest, in my view. I hope I made some sense in these notes.

Please do comment on any mistakes that I’ve made so that I can work on them. Better ways of approaching this problem are also welcome. :)

A year at Drizzle

Posted by: vjsamuel on: March 12, 2011

It’s almost a year since I first became a part of the Drizzle family and it’s been an amazing experience so far. I first came to Drizzle with an aspiration to become a Google Summer of Code intern. I did realise that dream under the guidance of my mentor Brian Aker. Google Summer of Code was awesome. I got to work on the command line options processing system alongside Brian, Monty Taylor and Jay Pipes. Summer of Code 2010 was a huge success and I did manage to get some good work into trunk.

Even though Summer of Code was over, I did hang around at Drizzle and do some struct-to-class refactoring, a lot of bug fixes and some Tamil UTF-8 test cases. I feel proud to work alongside such great people and learn new technologies with them as Drizzle grows. I have made countless friends here: Stewart Smith, Andrew Hutchings, David Shrewsbury and Patrick Crews, just to name a few. :)

I recently approached Brian and Monty regarding what I could do for the coming Summer of Code and Brian was quick to respond. :) Stored Procedure Interface was his reply, and SPI is what I am going to propose for the upcoming Summer of Code. Brian is very excited about it and so are many devs at #drizzle, including myself. I will be coming up with a description of this project in a few days’ time and I’m open to any comments.

I almost forgot about the GA! The GA of Drizzle is coming out in a few days. I’m pretty excited to be a part of the team and I will definitely contribute more to Drizzle. Thank you guys! I do love working at #drizzle. :) :) :)

Why search when you can “Grabble”

Posted by: vjsamuel on: December 4, 2010

I guess it has been a while since I updated my blog, probably because I’ve been caught up in end of semester activities. During that time I got to work on a very interesting project as a part of my college curriculum. I wrote a “file system search engine” using C++ and a lot of libraries from Boost, and I named it Grabble.

There are a lot of file system search engines out there and many may wonder why I would spend my time hacking on such an area when there are already a lot of good tools available. I have noticed that some of the file system search engines do take a while to respond to queries. So it was kind of a “Need For Speed” venture.

In the end, I did come up with a search engine that is quite fast, maybe even faster than the average engine out there, but it does come with a huge constraint in its memory requirement. I have given no regard to memory :P . There is a HUGE inverted index at the heart of the search engine’s server which is used to service the clients’ queries. Since an inverted index is used, it takes roughly constant time to look data up on the server. The system follows a client-server TCP connection model which I implemented using boost::asio. There are a lot of threads running in the system, which were implemented using boost::thread. The whole software is built on Monty Taylor. Thank you Monty!!!
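For readers unfamiliar with the term, an inverted index maps each search token to the list of items that contain it, so a query is a single hash lookup rather than a scan. The following is a minimal sketch in standard C++, not Grabble's actual code: the class name `InvertedIndex`, the whitespace tokenisation and the use of `std::unordered_map` are all assumptions made for illustration.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory inverted index: token -> paths that mention it.
class InvertedIndex {
public:
    // Index one file: every whitespace-separated token in `text`
    // points back to `path`.
    void add(const std::string& path, const std::string& text) {
        std::istringstream in(text);
        std::string token;
        while (in >> token) {
            std::vector<std::string>& hits = index_[token];
            // Avoid listing the same path twice in a row for one token.
            if (hits.empty() || hits.back() != path) hits.push_back(path);
        }
    }

    // Hash lookup, constant time on average; returns an empty list
    // when the token was never indexed.
    const std::vector<std::string>& find(const std::string& token) const {
        static const std::vector<std::string> none;
        auto it = index_.find(token);
        return it == index_.end() ? none : it->second;
    }

private:
    std::unordered_map<std::string, std::vector<std::string>> index_;
};
```

The trade-off matches the one described above: lookups are fast because everything is precomputed, and the price is that the whole index lives in memory.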

As far as performance is concerned, Grabble can retrieve data in under 10ms and report the absence of data in exactly 10ms. There are some bugs (a lot actually ;) ) in the system, but I would say that the system is still in its infancy and could be tweaked to attain perfection. So, if you are looking for a very high speed search engine then I guess it’s probably safe to say that you’d rather grabble than search for your data. ;) If you have any queries regarding Grabble you’re always welcome to PM me.

Being a part of the Beta release

Posted by: vjsamuel on: October 6, 2010

I’m probably the last of all the devs at Drizzle to blog about the big Beta release, but I haven’t had a lot of time on my hands these days to do it. It has been six months now since I first came to the Drizzle community to try and make it into Google Summer of Code, and I’ve seen Drizzle go through a lot of changes. One of those changes came from me: replacing my_getopt with boost::program_options, which I did as a part of Google Summer of Code under the guidance of Brian Aker. Now, the Beta may not be a big deal to many, but it is a huge deal to me as this is the first time I’m a part of something huge. :) :)

Even though Summer of Code has ended, I am still spending time in #drizzle in order to contribute patches, UTF-8 test cases and bug fixes to Drizzle. (It’s difficult to leave such good people and the Drizzle code base is awesome to work on ;) ). I will continue to contribute to Drizzle whenever I have time and see Drizzle grow as time goes by. Cheers to all my mates at #drizzle for making it so far, and my wishes to them to take Drizzle even higher.

The last of the “my_long_options structs”

Posted by: vjsamuel on: August 16, 2010

As Google Summer of Code 2010 comes to an end, my work of refactoring the command-line options and configuration file processing system using boost::program_options is also nearing its grand finale, which would probably mean the complete removal of my_getopt, which was used for that purpose until now. Over the past couple of weeks I have been working on innodb, embedded_innodb and the kernel, the three huge patches which were left.

I had stated in my last post that the two engines were in the process of being merged into trunk and I again state in this post “the two engines are in the process of being merged”. One may ask why it has taken so long to get them into trunk, which is a very logical question to ask. The answer to that is “MEMORY LEAKS”. I had failed to free memory after using strdup(), causing performance deterioration. Most people would have got a good laugh out of that. :P :P . Some more copy/paste errors seemed to prove how idiotic I can be at times. ;)
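The class of bug described here is easy to reproduce: strdup() allocates with malloc(), so every copy must eventually be passed to free(). One way to make that automatic in C++ is to hand the pointer to a smart pointer with a free()-based deleter. This is a sketch only, not the fix that went into Drizzle; `FreeDeleter`, `CString`, `duplicate` and `duplicate_as_string` are hypothetical names.

```cpp
#include <cstdlib>
#include <memory>
#include <string>
#include <string.h>  // strdup() is POSIX, not standard C++

// strdup() allocates with malloc(), so the matching release is free(),
// not delete.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};
using CString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical helper: copy a C string and return an owner that frees
// the copy automatically when it goes out of scope.
CString duplicate(const char* source) {
    return CString(::strdup(source));
}

// Alternative that avoids manual memory management entirely.
std::string duplicate_as_string(const char* source) {
    return std::string(source);
}
```

Where the receiving API accepts it, storing the value in a `std::string` is the simpler choice; the `unique_ptr` form is useful when a raw `char*` has to be kept around.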

With all that aside, Monty and I dived into the kernel with high hopes of getting the job done in time for the next release, which is tomorrow, while being constantly supervised by my mentor Brian. I helped lay the basic groundwork for the kernel patch. I removed the last my_long_options from the kernel, did the required operations to assign the variables their required defaults, and performed limit checks and block_size corrections with almost 30 notifier() calls ( :O :O Lot of copy/paste involved in that process :P ).
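A limit check with a block_size correction usually means clamping a numeric option into an allowed range and then aligning it to a multiple of some block size. The sketch below shows one way such a check could look in plain C++. The `Limits` struct, the `apply_limits` name and the choice to round down are assumptions for illustration and are not taken from the Drizzle kernel patch. In boost::program_options, a function like this can be attached to an option through `->notifier(...)`, which is called with the option's final value.

```cpp
#include <cstdint>

// Hypothetical limits for one numeric option.
struct Limits {
    std::uint64_t min;
    std::uint64_t max;
    std::uint64_t block_size;  // 0 or 1 means "no block size correction"
};

// Clamp `value` into [min, max] and align it to the block size.
std::uint64_t apply_limits(std::uint64_t value, const Limits& lim) {
    if (value > lim.max) value = lim.max;
    // Assumption: round down to the nearest multiple of block_size.
    if (lim.block_size > 1) value -= value % lim.block_size;
    // Rounding down may drop below the minimum, so check it last.
    if (value < lim.min) value = lim.min;
    return value;
}
```

With limits of 1024 to 1048576 and a block size of 1024, a request for 5000 is corrected to 4096, and a request for 10 is raised to the minimum of 1024.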

Monty is now working on making the kernel patch usable. The kernel is too complex for me to handle the entire patch by myself. I really hope that he completes the patch before the release. :) I’ve really had a wonderful time working with such wonderful people. I’ve been blogging as a summer intern for the past three months. My next post will probably be as a permanent developer of Drizzle. Thank you guys! Cheers!

What’s new in the plugin world!

Posted by: vjsamuel on: August 10, 2010

It’s been more than a month since I started working on refactoring the plugin system in order to support the new option processing system, which uses boost::program_options. I’ve had tonnes of help from Monty Taylor, who had worked on the interface. I’ve also had constant help from Brian Aker. I’ve slowly but surely refactored the entire command line, and the last of the refactored plugins are getting ready to be merged into trunk.

Monty has already described to the Drizzle community the changes that the new system has incorporated, but I feel that it is my duty to put the changes up as well. The new system does not support underscores unless this has been specified explicitly by the developer. All options are now of the form --plugin-name.option-name=value. One bug that has come up because of my work is that options that used to take values such as 1M can no longer take them; instead they take plain numeric values in multiples of 1024*1024, but I will be fixing this issue when I work on the kernel. (YAY! :) :) )
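Accepting values such as 1M again comes down to parsing a number followed by an optional size suffix. Here is a minimal sketch of such a parser in standard C++. The helper name `parse_size` and the set of suffixes (K, M, G as powers of 1024) are assumptions for illustration, not the fix that was later made in Drizzle, and overflow of very large values is not checked.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: turn "1M", "64k", "2G" or a plain number
// into a byte count. Suffixes are treated as powers of 1024.
std::uint64_t parse_size(const std::string& text) {
    std::size_t used = 0;
    // std::stoull throws std::invalid_argument if no digits are found.
    const std::uint64_t number = std::stoull(text, &used);
    if (used == text.size()) return number;  // no suffix at all
    if (used + 1 != text.size())
        throw std::invalid_argument("bad size: " + text);
    switch (std::toupper(static_cast<unsigned char>(text[used]))) {
        case 'K': return number * 1024ULL;
        case 'M': return number * 1024ULL * 1024ULL;
        case 'G': return number * 1024ULL * 1024ULL * 1024ULL;
        default:  throw std::invalid_argument("bad size suffix: " + text);
    }
}
```

A value such as `1M` on the command line would then resolve to 1048576, which is the plain number the option currently has to be given as.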

One issue that concerns me is that when I worked on the plugin system it came to my notice that most of the plugins are either very poorly tested or not tested at all. So this message goes out to all the Drizzle plugin devs across the globe: “Please test the plugins!!!”. I guess that I will be done working on the kernel by the end of the week and then it will be “Ba Bye my_getopt!!!” ( :P :) I guess most devs will be glad to see that day come!)

Progress with Plugins

Posted by: vjsamuel on: July 21, 2010

It’s been almost a month since I finished refactoring the entire client, so the client is completely independent of my_getopt (Hurray! :-)). For the past month I’ve been working on the plugin side of Drizzle. With the help of Monty’s new interface I’m slowly bringing boost::program_options into the command-line options of the plugins. I’ve done almost 10 plugins now. So, people who have contributed plugins to Drizzle that have command-line options, please have a look at the changes I’ve made to your code.
I’ll hopefully complete working on all the plugins within a week’s time and finally hack the kernel. :-)

The importance of notify()

Posted by: vjsamuel on: June 13, 2010

I recently refactored the drizzle client and came to know the importance of program_options::notify(). Initially I was including this notify() call blindly without knowing what it does, but it turns out that it is very important when you bind options to your own variables instead of reading the values out of your variables map.
For instance, if you have an option like
("foo", po::value<string>(&opt_foo)->default_value(""), "An option named foo")

After your command line has been parsed and stored, notify() does the job of assigning the argument of --foo to opt_foo.
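To make the two phases visible without depending on Boost, here is a toy model in standard C++. It is not Boost's implementation: `ToyOptions`, `bind`, `store`, `notify` and `stored` are hypothetical names that only imitate the behaviour described above. In Boost itself the corresponding calls are `po::store(...)`, which fills the variables map, followed by `po::notify(vm)`, which writes the values into variables bound with `po::value<T>(&var)`.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy model only: it imitates the store-then-notify phases in plain C++.
class ToyOptions {
public:
    // Register an option whose final value should end up in *target.
    void bind(const std::string& name, std::string* target,
              const std::string& default_value) {
        values_[name] = default_value;
        notifiers_.push_back(
            [this, name, target] { *target = values_.at(name); });
    }

    // Phase 1: record parsed "--name=value" arguments in the map only.
    // The bound variables are deliberately left untouched here.
    void store(const std::vector<std::string>& args) {
        for (const std::string& arg : args) {
            const std::size_t eq = arg.find('=');
            if (arg.compare(0, 2, "--") != 0 || eq == std::string::npos)
                continue;
            const std::string name = arg.substr(2, eq - 2);
            if (values_.count(name)) values_[name] = arg.substr(eq + 1);
        }
    }

    // Phase 2: copy the stored values into the bound variables.
    void notify() {
        for (auto& fn : notifiers_) fn();
    }

    const std::string& stored(const std::string& name) const {
        return values_.at(name);
    }

private:
    std::map<std::string, std::string> values_;
    std::vector<std::function<void()>> notifiers_;
};
```

After `store()` the map already holds the value of `--foo`, but a variable bound to the option keeps its old contents until `notify()` runs, which is exactly the symptom of a forgotten notify() call.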

I didn’t know this and spent hours trying to find out why my variables weren’t getting the values coming in via the command line. :) Now that I’ve found out the importance of notify() I have one less bug to worry about in the future. :) :)

Slap given a Boost ;)

Posted by: vjsamuel on: May 24, 2010

I’ve finally completed refactoring both the command-line options and the configuration file processing and filed my merge proposal. It was a long process and now that it’s finally done there are a lot of changes in drizzleslap. (Anyone who’s currently working on slap would get a serious headache when they re-merge with trunk after my code is merged into trunk!!! ;) :) )

Some of the changes that have been incorporated are as follows:

1) The struct my_long_options[] has been removed and program_options::options_description is used to define an object long_options.

2) The command line parsing and configuration file processing are taken care of by parse_command_line() and parse_config_file() instead of custom code, so one won’t be able to find handle_options any more.

3) get_options() has been renamed to process_options since its job now is just to process option values, and it now takes no parameters.

4) The internal::load_defaults() is no longer used and is replaced by the parse_config_file from boost.

5) Drizzle will no longer use the single configuration file drizzle.cnf; instead it will use multiple configuration files such as drizzleslap.cnf and client.cnf, which will be found in the ~/.drizzle and SYSCONFDIR/drizzle directories. The user configuration file overrides the system configuration file. The command line overrides both configuration files.

6) get_one_option() has been removed since it is of no use any more.
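The precedence described in item 5 (command line over user configuration file over system configuration file) can be sketched as a simple merge in standard C++. This is an illustration rather than the drizzleslap code: `Settings` and `merge_settings` are hypothetical names, and the option names used to try it out are made up.

```cpp
#include <map>
#include <string>

using Settings = std::map<std::string, std::string>;

// Merge in order of increasing priority: each later source overwrites
// any value already set by an earlier one.
Settings merge_settings(const Settings& system_file,
                        const Settings& user_file,
                        const Settings& command_line) {
    Settings merged = system_file;  // lowest priority
    for (const auto& kv : user_file) merged[kv.first] = kv.second;
    for (const auto& kv : command_line) merged[kv.first] = kv.second;  // highest
    return merged;
}
```

Note that boost::program_options works the other way round: `store()` does not overwrite a value that is already present in the variables map, so the same precedence is obtained there by storing the command line first, then the user file, then the system file.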

Similar changes will be implemented on the server side soon. Hopefully all the custom code used for command line and configuration file parsing will be replaced by standard boost::program_options within 2 to 3 months. :)
