2012-06-22

The Blog is dead, god save the Blog!

This blog got resurrected as: https://idea.popcount.org/



2010-05-29

iPad thoughts

A quick bunch of fresh thoughts on the iPad (I had no prior experience with the iPhone):

  • That's the machine you'd like to take with you to the toilet or to the kitchen. Laptops are just too big.
  • Yes, one doesn't know where to tap/click. I didn't know where to tap to buy an app in the App Store!
  • I miss a back button very much. Maybe that will go away when multitasking comes in.
  • The back button in Safari is very far away (top left corner, the furthest possible place for a right-handed person). It's also way too small.
  • It's hard to open and close tabs in Safari.
  • Jailbreaking firmware 3.2 works out of the box. It's easier than navigating through the App Store.
  • Fingerprints on the glass, or just dirty glass, don't make any difference. That's cool.
  • The physical "home" button sometimes ends up in weird places when you rotate the iPad in the wrong direction, which happens pretty often.
  • Obviously, for most things the iPad is useless without access to Wi-Fi.
  • It is heavy.
  • It looks like Safari isn't swapped out/killed. You want multitasking? Just open a new web page and leave it in the background.
  • iPhone apps are useless, their screen is too small. 2x zoom mode helps, but not a lot.
  • Wi-Fi reception is pretty weak (compared to other devices).
  • Numbers (the spreadsheet from Apple) is brilliant.



2010-03-01

Memory matters - even in Erlang

I played with a pretty interesting bug, or maybe a feature, in the Erlang VM. Read more on the LShift blog.



2010-01-07

Search in the versioned world

Whatever you are using, be it a blog, a portal, a forum or a microblogging service, search works on the current set of documents.


If I remove my last blog post, I expect search to omit it from the results. Search ought to include only the most recent state of documents.

But what if you can navigate through the history of a document? What should appear in the search results? We are used to seeing only the most recent version of a document. For example, Wikipedia's search doesn't show documents that contained the searched phrase only in the past.

This problem also appears in other contexts. Consider using a VCS. I created some content, or code. I removed it as it looked unneeded. Some time later I realized that I need this content back. What can I do to recover it? Currently I have to manually review diffs in Git, Mercurial or whatever VCS I use. Shouldn't there be a button somewhere: "search in history"?

So, how do we expect a search to work on versioned content? Is that feature useful at all?
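
To make the question concrete, here is a minimal sketch of a search index that remembers every revision. The class and method names (VersionedIndex, commit, search) are made up for illustration, and the index only handles single words. For code, Git's pickaxe option (git log -S<string>) already gets close to a "search in history" button, since it lists commits that add or remove a string.

```python
from collections import defaultdict


class VersionedIndex:
    """Toy inverted index that remembers every revision of every document."""

    def __init__(self):
        # word -> set of (doc_id, revision) pairs whose text contained the word
        self._postings = defaultdict(set)
        # doc_id -> number of the latest revision
        self._latest = {}

    def commit(self, doc_id, text):
        """Store a new revision of doc_id and return its revision number."""
        revision = self._latest.get(doc_id, 0) + 1
        self._latest[doc_id] = revision
        for word in set(text.lower().split()):
            self._postings[word].add((doc_id, revision))
        return revision

    def search(self, word, include_history=False):
        """Return sorted (doc_id, revision) hits for a single word."""
        hits = self._postings.get(word.lower(), set())
        if not include_history:
            # Default behaviour: only the most recent revision counts.
            hits = {(d, r) for d, r in hits if self._latest[d] == r}
        return sorted(hits)


index = VersionedIndex()
index.commit("post", "draft with a secret paragraph")
index.commit("post", "draft without it")
print(index.search("secret"))                        # current state only
print(index.search("secret", include_history=True))  # every revision
```

The first search finds nothing, because the word is gone from the latest revision. The second one still points at revision 1, which is exactly the content I would want to recover.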


2009-12-05

Undusting the blog

It's time to undust this blog a bit. Here is the list of my recent posts on LShift blog, with a word of comment:

  • Introducing RabbitMQ-Status plugin. Well, this plugin is intended to help sysadmins understand what's happening inside Rabbit. Stupid, simple, robust: the way sysadmin stuff should be.
  • Python Quirks. My rant about the Python language. I like Python in general, but it has some dark corners and stuff that nobody uses or understands. I tried to sum up the things that I would change in the language. These things should at least be better documented.
  • YDB: yet another key-value database. I was disappointed with BDB/TC write performance, so I decided to hack something together. I wanted to see if I could beat them.
  • Python Queue Interface for AMQP. The Python Queue interface is nice. It's a shame that it's not scalable across many machines. I tried (and partially failed) to implement a similar API over AMQP.


2009-08-27

Paypal/Ebay are broken

I just spent more than an hour trying to buy a thing from Ebay. I'm really pissed off. In short: Fuck you ebay.

So the story goes like this:

  • I found an item
  • I clicked "buy it now". So far, so good.
  • They want me to register on ebay. I don't want yet-another-password and yet-another-stupid-user-name. What the hell do I need that for? I'm trying to buy a thing, not insurance. Why do you need my date of birth? Why do you want my phone number (landline only! mobiles not accepted!)?
  • We're not yet finished with the ebay account. I need to verify my account! By phone or by credit card number? Well, after the 10th try I managed to give them some random fake telephone number, as I don't have a landline. So I must give them my credit card number. Start counting: 1st time I entered my credit card details.
  • This went smoothly. Have I already bought the item? Not really, I need to click through some stupid confirmation process. Yes I do want to buy this item. Yes I do mean that. Yes, I still haven't changed my mind.
  • I almost thought it was done. How mistaken I was. Now I need to pay. Why am I able to buy a thing without paying... but let's leave that question for now.
  • Click, click, yes I want to pay. I entered my credit card number (second time). Why do they need my date of birth to pay? My bank is ignoring this information, so it's absolutely useless there. Why do I have to give a telephone number? Fortunately number 02012345678 works fine. Yes, do request more information from me, I'm always able to be smarter than you and enter fake data. The entropy in your database will grow. My frustration will increase.
  • Ebay kindly informs me that the transaction was not accepted or something like that. Well, that shouldn't happen. Let's try again.
  • Back button, entered credit card details for the 3rd time.
  • Payment is directed to Paypal, and it returns a nice 500 error page.
  • Checked emails. Once. Again. Nothing. Well, let's just try again.
  • "We advise that you don't use your browser's back button." Fair enough, I clicked through all the ebay pages to get to the payment site again.
  • Entered credit card details. 4th time.
  • The same paypal error. At this point I became really pissed off.
  • Okay, a few years ago I had a paypal account, let's try to pay using that, instead of entering credit card details by hand.
  • I created the account in the previous century, so obviously I have no idea what the password is. Clicked on the "forgotten password" link.
  • Click Click Click. I got an email with a link from them. Normally at this point the password recovery story ends. But not in paypal!
  • "How do you want to verify your account?". What the heck. Okay, to proceed I had to remember two of my security questions or give them the credit card number I used a few years ago with that account. Of course I have no idea what my answers to the questions were.
  • I spent the next fifteen minutes trying to find my old credit card, it should be somewhere...
  • I was surprised to see that it should still be working, so maybe I could try to pay on ebay using this card.
  • Clicked through ebay again, to get to the payment site.
  • Entered my credit card number. 5th time, but this time different card. Guess what happened....

  • Okay, so I do need to get access to that paypal account.
  • I followed the paypal-reset-password link again. Chose credit-card verification, and entered my credit card information for the 6th time.
  • After that point I was able to actually pay for this ebay item using paypal. But I still was forced to enter my credit card expiry date and cvv number, so I count it as entering my credit card details for the 7th time.

Dear Ebay, I just wanted to buy a thing. I don't care about paypal or ebay accounts, and I have already forgotten the passwords. I don't care if you have my addresses (every one of the three I used), and I feel silly when you ask me for my date of birth.

One more thing, requiring me to give you my credit card details seven times doesn't increase my confidence in the security of your system.




2009-06-01

Memcached protocol is not enough

For some time I've been trying to hack together a prototype of a real-time full-text search engine (RTFTSE!). I used the memcached binary protocol as the communication protocol between backends and storage nodes.

This is a pretty nice protocol and in fact there are strong reasons not to use the ASCII memcached protocol.



2009-05-19

The ministry of strange syscalls

My favorite syscall today:

$ man 2 readahead
"readahead() populates the page cache with data from a file so that subsequent reads from that file will not block on disk I/O."

I don't really know when I should use that, but it sounds cool. It's just an implementation of prefetching on yet another layer. Wait a moment...

"readahead() blocks until the specified data has been read. "

I'm lost. If it blocks, why not just use read(2)?
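
For comparison, posix_fadvise(2) with POSIX_FADV_WILLNEED is documented as initiating a non-blocking read into the page cache, which is closer to what I expected from a prefetching call. Python exposes it as os.posix_fadvise on Unix. A minimal sketch, where the prefetch helper and the throwaway temp file are only for illustration:

```python
import os
import tempfile


def prefetch(path):
    """Hint that `path` will be read soon; returns the file size in bytes."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # POSIX_FADV_WILLNEED: ask the kernel to start reading this range
        # into the page cache. It is advice, so there is no guarantee.
        os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
        return size
    finally:
        os.close(fd)


# Demo on a throwaway file, just to exercise the call.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 4096)
prefetched = prefetch(tmp.name)
os.unlink(tmp.name)
```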


Second syscall:
$ man 2 madvise
"The madvise() system call advises the kernel about how to handle paging input/output in the address range beginning at address start and with size length bytes"
"The kernel is free to ignore the advice."


I don't get when I should use it.
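
One plausible use is telling the kernel that a memory-mapped file will be scanned once, front to back. Python 3.8+ exposes the call as mmap.madvise on platforms that support it. The sketch below is only an illustration with a made-up count_newlines helper, and whether the hint changes anything is up to the kernel:

```python
import mmap
import os
import tempfile


def count_newlines(path):
    """Count newline bytes in a file via mmap, hinting sequential access."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Advice only: the kernel may read ahead more aggressively,
            # or ignore the hint entirely.
            if hasattr(mapped, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mapped.madvise(mmap.MADV_SEQUENTIAL)
            total = 0
            position = mapped.find(b"\n")
            while position != -1:
                total += 1
                position = mapped.find(b"\n", position + 1)
            return total


# Demo on a small throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one\ntwo\nthree\n")
newline_count = count_newlines(tmp.name)
os.unlink(tmp.name)
```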


And the last one:
$ man 2 mincore
"mincore() returns a vector that indicates whether pages of the calling process’s virtual memory are resident in core (RAM), and so will not cause a disk access (page fault) if referenced."

Cool, I can check if my memory page is in swap. I doubt it can help me with anything. If I don't want my pages to be in swap I'll just use mlock(2)...


Probably there are dozens of strange syscalls out there!



2009-05-18

GIT is ahead of SVN!

Git is the most popular DVCS right now. Google Trends confirms that.



But the interesting fact is that Git is, for the first time, ahead of its grandpa SVN.


It's worrying, though, that CVS is still alive. It should have been dead ten years ago. Hopefully we are watching a slow death. The end is inevitable.



2009-03-10

QCon mini In The Brain session

I'll be speaking about a simplified Etherpad clone at QCon London on Wednesday, March 11th.

The presentation will take place at the Skillsmatter stand (booth number 10), at 16:45.




2009-03-04

Evserver, part3: Simplified Etherpad clone

This time I hacked together a few open source projects. The result of my work, apart from a few minor bugs, seems to be working.


I don't know what I should do with this project next. There are a few possible options:

  • throw it away and forget about it
  • add support for markups other than reStructuredText
  • implement new features, like private documents or downloading the rendered markup in different formats
We'll see.

For now the priority is to create a quick presentation about this project, so I can show it during a break at QCon.


2009-02-18

EvServer, part2: Rabbit and Comet


I just published the next part of the EvServer story.

I'm working on the third part, though my major concern is that I'm running out of proofreaders.


2009-02-06

EvServer, Introduction: The tale of a forgotten feature

This article about EvServer doesn't really show it as an innovative piece of software. I hope that future articles will.

It seems that the most important thing in this post is a blinking exclamation mark.


BTW, ItBlog seems to be dead. For historical reasons I placed my old posts from that blog here, here and here.



2009-01-22

I was wondering how Twitter search works



A few days ago I committed a blog post about Twitter search.

As a matter of fact, I'm not sure what the solution is to the question I asked there. On the other hand I presented some interesting (at least for me) numbers.

While writing this stuff I learned a lot and maybe this blog post will be a motivation for someone to actually create such a perfect system.

I'm still thinking about the persistence layer...



2008-12-15

Asynchronous libraries performance


I committed the next post on LShift's blog.

BTW, this is my 100th post on this blog!



2008-12-06

Nmap Book finally released!


I'm really excited. After years of work Fyodor finally managed to finish The Book, "Nmap Network Scanning". It should be a great gift for every security geek. You can preorder it on Amazon for $34.



2008-11-21

AMQP? Maybe not yet. At least not with Python.


In my last Python project I thought about using AMQP messaging. A quick look at Python AMQP libraries shows two possibilities: the Apache Qpid Python client and py-amqplib.
Both libraries are developed closely with their brokers: the Apache client with Qpid, py-amqplib with RabbitMQ.

Apache Qpid

I started my adventure by downloading the source. They don’t have binary packages so this is what’s needed:
$ wget http://www.apache.org/dist/incubator/qpid/M3-incubating/qpid-incubating-M3.tar.gz
$ tar xvzf qpid-incubating-M3.tar.gz
$ cd qpid-incubating-M3/
Hmm. What’s next... No README files... After some googling I found that there’s a getting started guide. According to it, this should be the magic command:
$ cd java/broker
$ PATH=$PATH:bin QPID_HOME=$PWD ./bin/qpid-server -c etc/persistent_config.xml
./bin/qpid-server: line 37: qpid-run: No such file or directory
Okay, let’s look for the missing file somewhere:
$ find ../.. -name qpid-run
../../java/common/bin/qpid-run
$ cp ../../java/common/bin/qpid-run bin
Next try:
$ PATH=$PATH:bin QPID_HOME=$PWD ./bin/qpid-server -c etc/persistent_config.xml
Setting QPID_WORK to /home/majek as default
System Properties set to -Damqj.logging.level=info -DQPID_HOME=/home/majek/b/qpid-incubating-M3/java/broker -DQPID_WORK=/home/majek
Using QPID_CLASSPATH /home/majek/b/qpid-incubating-M3/java/broker/lib/qpid-incubating.jar:/home/majek/b/qpid-incubating-M3/java/broker/lib/bdbstore-launch.jar
Info: QPID_JAVA_GC not set. Defaulting to JAVA_GC -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError
Info: QPID_JAVA_MEM not set. Defaulting to JAVA_MEM -Xmx1024m
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/qpid/server/Main
Caused by: java.lang.ClassNotFoundException: org.apache.qpid.server.Main
That’s enough. I’m definitely too stupid to run this server and too busy to waste more time here (there is also another reason why I’m not really interested in Qpid, read on).

py-amqplib + RabbitMQ

Here installation went smoothly.
$ easy_install amqplib
The broker:
$ wget http://www.rabbitmq.com/releases/rabbitmq-server/v1.4.0/rabbitmq-server_1.4.0-1_all.deb
$ sudo dpkg -i rabbitmq-server_1.4.0-1_all.deb

The good

I also downloaded the py-amqplib sources and found out that the examples work out of the box:
~/amqplib-0.5/demo$ ./demo_send.py  1
~/amqplib-0.5/demo$ ./demo_receive.py
The most common way of using Barry’s library is to write software that only consumes data from AMQP: use one event loop that blocks forever, and when something happens on AMQP the callback is executed. I have a slightly different use case. The project I’m writing uses asynchronous programming, which means I have one event loop that is not the AMQP event loop. I want to run the AMQP stack every time something happens on a socket. This is how it would look in pseudocode:
while True:
    nonblocking_consume_amqp_events()
    <magically_wait_for_activity_on_amqp_socket>
The Qpid Python client has only a blocking interface, so it’s impossible to write code like that. Fortunately there’s a nonblocking client in Barry’s library. There’s even an example:
./nbdemo_receive.py
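
The "magically wait" step in the pseudocode above is usually a select() (or poll/epoll) call over all the sockets the application owns. A minimal sketch of that loop shape, assuming nothing about amqplib: a local socket pair stands in for the broker connection, and drain() is a hypothetical stand-in for nonblocking_consume_amqp_events().

```python
import select
import socket


def drain(sock):
    """Read everything currently buffered on a non-blocking socket."""
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            break  # nothing more to read right now
        if not data:
            break  # peer closed the connection
        chunks.append(data)
    return b"".join(chunks)


# A local socket pair stands in for the broker connection.
broker_side, client_side = socket.socketpair()
client_side.setblocking(False)
broker_side.sendall(b"msg-1;msg-2;")

received = b""
while not received.endswith(b"msg-2;"):
    # Wait until the socket is readable; other sockets owned by the
    # application could be added to the same list.
    readable, _, _ = select.select([client_side], [], [], 1.0)
    if client_side in readable:
        received += drain(client_side)

broker_side.close()
client_side.close()
print(received)
```

The point of this shape is that the application keeps its own event loop and only hands control to the messaging code when there is something to read.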

The bad

The demo doesn’t work correctly when there are more messages in the queue and fails, not very nicely, with this traceback:
Traceback (most recent call last):
[...]
File "build/bdist.linux-i686/egg/amqplib/nbclient_0_8.py", line 74, in write
P' read_buf='&!amq.ctag-FC3ET7kTcqy/A93gJYQWqw==6!amq.ctag-FC3ET7kTcqy/A93gJYQWqw==myfan2

The above bug is triggered by basic_ack. So I removed that command and stopped sending message acknowledgements. Now the order of message delivery got broken:
['1', '1']
['2', '1', '2']
['3', '1', '2', '3']
['4', '1', '2', '3', '4']
If the problem is with acknowledgements, then one could say: use no_ack on the channel and just disable them. Sorry, this also doesn’t work.
I send ten messages:
['a_0', 'a_1', 'a_2', 'a_3', 'a_4', 'a_5', 'a_6', 'a_7', 'a_8', 'a_9']
Then another ten:
['b_0', 'b_1', 'b_2', 'b_3', 'b_4', 'b_5', 'b_6', 'b_7', 'b_8', 'b_9']
I receive:
['b_0', 'b_1', 'b_2', 'b_3', 'b_4', 'b_5', 'b_6', 'b_7', 'b_8', 'b_9',
'a_0', 'a_1', 'a_2', 'a_3', 'a_4', 'a_5', 'a_6', 'a_7', 'a_8', 'a_9',
'b_0', 'b_1', 'b_2', 'b_3', 'b_4', 'b_5', 'b_6', 'b_7', 'b_8', 'b_9']
While I was working on this issue I was worried that my Python process was eating more and more memory. After some debugging I discovered that amqplib has very nice memory leaks (okay, reference cycles).
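
For anyone who hasn't hit this before: objects in a reference cycle are not freed by reference counting alone, they wait for the cyclic garbage collector. A minimal sketch of the effect in CPython follows. The Connection and Channel classes here are made up for illustration and are not amqplib's.

```python
import gc
import weakref


class Channel:
    def __init__(self, connection):
        self.connection = connection   # channel -> connection


class Connection:
    def __init__(self):
        self.channel = Channel(self)   # connection -> channel: a cycle


gc.disable()                 # make the demo deterministic
conn = Connection()
probe = weakref.ref(conn)    # lets us observe whether the object is alive
del conn                     # refcount never reaches zero because of the cycle
alive_after_del = probe() is not None
gc.collect()                 # the cycle collector finds and frees the pair
alive_after_collect = probe() is not None
gc.enable()
print(alive_after_del, alive_after_collect)
```

In a long-running process that creates many such pairs, memory grows until the collector runs, which looks a lot like a leak from the outside.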

The ugly

After a few days I finally managed to create the nonblocking code that I needed. But it’s really ugly:
# nbamqp: presumably amqplib's nonblocking client module
# (nbclient_0_8 in the traceback above).

class MException(Exception):
    """Raised from my_nb_callback and caught around nbloop() below."""

def my_nb_callback(ch):
    raise MException

conn = nbamqp.NonBlockingConnection('localhost',
    userid='guest', password='guest',
    nb_callback=my_nb_callback, nb_sleep=0.0)

ch = conn.channel()
ch.access_request('/data', active=True, read=True)

ch.exchange_declare('myfans', 'fanout', auto_delete=True)
qname, _, _ = ch.queue_declare()
ch.queue_bind(qname, 'myfans')

msgs = []
def callback(msg):
    msgs.append( msg )

ch.connection.sock.sock.setblocking(False)
ch.basic_consume(qname, callback=callback)
while True:
    msgs = []
    <magically_wait_for_data_on_ch.connection.sock.sock>
    try:
        nbamqp.nbloop([ch])
    except MException:
        pass
    # Messages may arrive duplicated (see above), so keep only the first
    # copy of each body, preserving order.
    unique_msgs_filter = {}
    unique_msgs = []
    for msg in msgs:
        msg.channel.basic_ack(msg.delivery_tag)
        if msg.body not in unique_msgs_filter:
            unique_msgs_filter[msg.body] = True
            unique_msgs.append(msg.body)
    print '%r ' % (unique_msgs)


Back to RabbitMQ

For this simple research Rabbit works perfectly, right out of the box. For a moment I even forgot that the Python library is not all of the software involved. But when I wanted to see some more details about it, things became messy:
$ rabbitmqctl --help
Password:
Which password would you like? Even if I knew that, I’m not going to give any passwords away to see the help message, sorry. Next try:
$ rabbitmq-server --help
/usr/sbin/rabbitmq-server: 44: cannot create /var/log/rabbitmq/rabbit.log.bak: Permission denied
/usr/sbin/rabbitmq-server: 45: cannot create /var/log/rabbitmq/rabbit-sasl.log.bak: Permission denied
{error_logger,{{2008,11,21},{0,26,12}},"Protocol: ~p: register/listen error: ~p~n",["inet_tcp",einval]}
{error_logger,{{2008,11,21},{0,26,12}},crash_report,[[{pid,<0.22.0>},{registered_name,net_kernel},{error_info,{error,badarg}},{initial_call,{gen,init_it,[gen_server,<0.19.0>,<0.19.0>,{local,net_kernel},net_kernel,{rabbit,shortnames,15000},[]]}},{ancestors,[net_sup,kernel_sup,<0.9.0>]},{messages,[]},{links,[<0.19.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,377},{stack_size,21},{reductions,309}],[]]}

Conclusion
Even if the AMQP protocol is great, the AMQP user stack still needs a lot of polish in my opinion.

Update #1:
rabbitmq-discuss thread


2008-11-18

How to escape from blocked unactivated Vista

A few times I was pissed off by this screen:
It means that I haven't activated Vista in time and Microsoft stopped liking me. The problem is that at least twice I was caught by this screen when I had no access to the net. I had the WiFi password, but hadn't entered it yet. When Vista is blocked you don't have access to any networking settings, so the password is useless. Yet another time, because of a bug in Vista, I needed to re-enter my Windows key. Of course the easiest way to get the key is to use KeyFinder rather than waste time searching for the key. Which, in the end, would probably be somewhere online - like in MSDNAA.

Fortunately you can easily escape from the blocked Vista. This is how you can do it. After clicking "Activate now" this screen appears:

Just choose "buy a new product key online", which will spawn a new browser window. I had Opera set as the default browser, but any browser should work. The trick is to enter "c:\windows\system32\cmd.exe" as the URL.
We want to run this magic program.
Next, the well known black cmd dialog should appear. We're back home :) Just enter "explorer" to run the usual windows Start menu and task bar.
That's it. You now have normal access to your computer. You can then configure wireless networks or do whatever you like.

The only problem is remembering these instructions in case you don't have access to the internet.



2008-11-17

Tracing Python memory leaks

Tracing Python memory leaks.



2008-11-11

Simple inter-process locks

On the LShift blog I wrote about simple inter-process locks in Python.