
Faster Inter-Process Communication

Fast codechecker
A situation arose where we needed to exchange signals between a Python process and a C++ one. Whenever an HTTP request is fired at our server, a Python thread handles the request. The Python thread inserts the data into our MySQL table with the column ‘processed’ = ‘false’ and waits for the C++ daemon to process the data. After processing the data, the C++ daemon writes the result to the MySQL database and also sets processed = true. The Python thread then picks up this data and serves the request. This is how it happens:
[python highlight="10,11,12,15"]
import time

# Insert the request data (request_data) with processed = false
sql = "INSERT INTO tablename (data, processed) VALUES (%s, 'false')"
cursor.execute(sql, (request_data,))
insertid = cursor.lastrowid  # ID of the row we just inserted
processed = False
while not processed:
    # Loop until the data is processed by the C++ daemon
    sql = "SELECT * FROM tablename WHERE id = %s AND processed = 'true'"
    cursor.execute(sql, (insertid,))
    row = cursor.fetchone()  # None until the daemon has set processed = true
    if row is not None:
        # The data has been processed
        processed = True
    else:
        time.sleep(1)  # to prevent overload on the MySQL server
[/python]
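And in the background, the C++ daemon does this:
[cpp highlight="7"]
// Pseudocode: mysql() stands in for the daemon's query helper
// Get any data waiting to be processed
sql = "SELECT id, data FROM tablename WHERE processed = 'false'";
id, data = mysql(sql);
// ... process the data ...
// Write the result back and mark the row as processed
mysql("UPDATE tablename SET data = <processed data>, processed = 'true' WHERE id = <id>");
[/cpp]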
We took a performance hit, and while profiling we realised that MySQL was getting hit very badly and the Python thread was sleeping most of the time. If we removed the time.sleep(1) from the Python code, MySQL got hit even harder. Clearly we needed a better IPC mechanism. Yup, third-year college Linux course! After some googling we ended up using named pipes, and this is how we implemented it:
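[python highlight="7,10,11"]
import os

sql = "INSERT INTO tablename (data, processed) VALUES (%s, 'false')"
cursor.execute(sql, (request_data,))
insertid = cursor.lastrowid
fifo_name = str(insertid)  # os.mkfifo() needs a string path
os.mkfifo(fifo_name)  # make a pipe named after the insert ID
# Open the pipe for reading; the process waits at this line
# until the daemon opens the same pipe for writing
with open(fifo_name, "r") as pipe:
    data = pipe.readline()  # blocks until the daemon writes the data
os.remove(fifo_name)  # the pipe is no longer needed
# After receiving the data from the pipe, serve the request
[/python]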
Meanwhile, in the background:
[cpp highlight="10,11"]
// Pseudocode: mysql() stands in for the daemon's query helper
// Get any data waiting to be processed
sql = "SELECT id, data FROM tablename WHERE processed = 'false'";
id, data = mysql(sql);
// Set processed to true
mysql("UPDATE tablename SET processed = 'true' WHERE id = <id>");
/* Open the pipe for writing.
   Notice we open the same pipe (whose name is the insert ID)
   on which the Python process is listening for data. */
FILE *pipe = fopen(id, "w");  // id as a string
fprintf(pipe, "%s", data);    // write the processed data to the pipe
fclose(pipe);                 // flush and close so the reader gets the data
/* As soon as the daemon writes to the pipe,
   the Python process knows that
   the C++ daemon has processed the information. */
[/cpp]
This way we took the load off our MySQL database and also made sure that our app doesn’t waste time sleeping (time.sleep(1)) while a request is waiting to be served.
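The handshake can be tried end to end without MySQL or a C++ daemon. What follows is only a minimal sketch, assuming a POSIX system: a Python thread stands in for the daemon, and the names `wait_for_result`, `fake_daemon` and `demo`, along with the temporary directory, are made up for illustration and are not part of the setup described above.

```python
import os
import tempfile
import threading
import time


def wait_for_result(fifo_path):
    """Block until a writer sends one line through the FIFO, then return it."""
    # open() blocks here until someone opens the FIFO for writing
    with open(fifo_path, "r") as pipe:
        return pipe.readline().rstrip("\n")


def fake_daemon(fifo_path, result, delay=0.2):
    """Hypothetical stand-in for the C++ daemon: work for a while, then answer."""
    time.sleep(delay)  # pretend to process the data
    with open(fifo_path, "w") as pipe:
        pipe.write(result + "\n")


def demo(insert_id=42):
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, str(insert_id))  # pipe named after the row ID
    os.mkfifo(fifo_path)
    try:
        worker = threading.Thread(target=fake_daemon, args=(fifo_path, "processed-data"))
        worker.start()
        result = wait_for_result(fifo_path)  # waits in the kernel, no polling loop
        worker.join()
        return result
    finally:
        os.remove(fifo_path)
        os.rmdir(workdir)


if __name__ == "__main__":
    print(demo())
```

The waiting side issues no queries and never sleeps on a timer; it is woken up when the writer shows up, which is the property we were after.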
Take a look at how it improved our performance. Before implementing named pipes, it took around 4.2 seconds to compile and test a simple hello world program.

After implementing named pipes, we were able to bring it down to a mere 1.2 seconds 🙂

Comments (7)

  • Great! This is the first time I have seen an application of the concept I learned in my Netprog class. One common issue I faced while programming with named pipes was that I used to forget to close them and then try to re-read from them. Also, we can’t keep too many pipes open at a time; I think 1024 is the limit [I am not sure, this was way back in 2003]. I hope you have taken care of all these kinds of error checks. One suggestion: keep the existing old logic in place as well, so that if the pipe fails the user still gets the results instead of unexpected results.

    • Krishna Chaitanya
    • February 22, 2011 at 3:12 am
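The two cautions in the comment above (always close the pipe, and keep the old logic as a fallback) could be sketched like this. It is a hedged illustration, not the code from the post: it assumes Linux semantics for `select` on a FIFO that has no writer yet, and `read_fifo_with_timeout`, `fake_daemon` and `demo` are hypothetical names.

```python
import os
import select
import tempfile
import threading
import time


def read_fifo_with_timeout(fifo_path, timeout):
    """Return the bytes written to the FIFO, or None if nothing arrives in time."""
    deadline = time.monotonic() + timeout
    # O_NONBLOCK makes open() return at once even if no writer exists yet
    fd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
    chunks = []
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None  # caller can fall back to the old polling logic
            readable, _, _ = select.select([fd], [], [], remaining)
            if not readable:
                return None
            chunk = os.read(fd, 4096)
            if not chunk:  # writer closed its end: the message is complete
                return b"".join(chunks)
            chunks.append(chunk)
    finally:
        os.close(fd)  # always close, so descriptors never pile up


def fake_daemon(fifo_path, payload, delay=0.1):
    """Hypothetical stand-in for the C++ daemon."""
    time.sleep(delay)
    with open(fifo_path, "wb") as pipe:
        pipe.write(payload)


def demo():
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "42")
    os.mkfifo(fifo_path)
    try:
        # No daemon running: the read gives up instead of hanging forever
        timed_out = read_fifo_with_timeout(fifo_path, timeout=0.2)
        # Daemon answers in time
        worker = threading.Thread(target=fake_daemon, args=(fifo_path, b"processed-data"))
        worker.start()
        answered = read_fifo_with_timeout(fifo_path, timeout=5)
        worker.join()
        return timed_out, answered
    finally:
        os.remove(fifo_path)
        os.rmdir(workdir)
```

When the function returns None, the request handler can run the original SELECT loop for that one row, so a stuck daemon degrades to the old behaviour rather than to a hung request.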
  • If there is quite a bit of such interaction in your application, I would use a messaging server like ActiveMQ or something.
    But if this is a one-off thing, you should ensure that your C++ daemon is not polling continuously on your DB. Instead of opening a new pipe for every row, you could have two pipes between the applications. After inserting a record, the Python side sends the record ID to the C++ side. The C++ daemon should query based on the record ID (not on “WHERE processed=False”), do the job and return the results to the Python side.
    You’ll have to handle the end cases of “when the C++ daemon is not available”, “when the Python side is not available”, etc. That’s where message-oriented middleware comes in 😉
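The two-pipe layout suggested in this comment might look roughly like the sketch below. It assumes a POSIX system and a line-based protocol invented for the example (the record ID goes out, `id<TAB>result` comes back); `PipeClient`, `daemon_loop` and `fake_process` are hypothetical names, and a Python thread plays the role of the C++ daemon.

```python
import os
import tempfile
import threading


def fake_process(record_id):
    """Hypothetical stand-in for the daemon's real work on one record."""
    return f"result-for-{record_id}"


def daemon_loop(request_path, response_path):
    """Stand-in for the C++ daemon: read record IDs, answer 'id<TAB>result'."""
    # Both sides open the request pipe first, so neither waits on the other forever
    with open(request_path, "r") as requests, open(response_path, "w") as responses:
        for line in requests:  # blocks until an ID arrives; stops at EOF
            record_id = line.strip()
            responses.write(f"{record_id}\t{fake_process(record_id)}\n")
            responses.flush()  # send the reply right away


class PipeClient:
    """Python side: one long-lived pair of FIFOs shared by all request threads."""

    def __init__(self, request_path, response_path):
        self._requests = open(request_path, "w")
        self._responses = open(response_path, "r")
        self._lock = threading.Lock()  # one request/response exchange at a time

    def call(self, record_id):
        with self._lock:
            self._requests.write(f"{record_id}\n")
            self._requests.flush()
            reply = self._responses.readline()
        if not reply:
            raise ConnectionError("daemon closed the response pipe")
        reply_id, result = reply.rstrip("\n").split("\t", 1)
        if reply_id != str(record_id):
            raise RuntimeError(f"got reply for {reply_id}, expected {record_id}")
        return result

    def close(self):
        self._requests.close()  # the daemon sees EOF and leaves its loop
        self._responses.close()


def demo():
    workdir = tempfile.mkdtemp()
    request_path = os.path.join(workdir, "requests")
    response_path = os.path.join(workdir, "responses")
    os.mkfifo(request_path)
    os.mkfifo(response_path)
    daemon = threading.Thread(target=daemon_loop, args=(request_path, response_path))
    daemon.start()
    client = PipeClient(request_path, response_path)
    try:
        return [client.call(record_id) for record_id in (1, 2)]
    finally:
        client.close()
        daemon.join()
        os.remove(request_path)
        os.remove(response_path)
        os.rmdir(workdir)
```

The lock keeps the sketch simple by allowing one exchange in flight at a time; with many concurrent requests, that serialisation and the "other side is not available" cases are exactly where a message broker starts to pay off.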

  • Using a while(1) loop to poll the database is almost always a bad idea.
    You may consider using a message queue (ActiveMQ or RabbitMQ) where the Python thread puts the tasks (or insert IDs) and the C++ daemon picks them up using a blocking get call.
    You may also consider using another queue for the C++ daemon to write to and the Python thread to read from instead of using pipes, as they are limited in number.

    • Mohit Ranka
    • April 3, 2011 at 2:40 pm
  • This is the first time I have ever seen a select query running in while (1) loop, I can only feel sorry for the database getting polled *hard* ;).
    As Joe said before me, I think using a messaging server like ActiveMQ might give you a better performance and system reliability.
    You can also try Apache Thrift or Protocol Buffers for efficient cross-platform IPC calls. Having worked with Thrift myself, I would suggest running the C++ code as a Thrift server instance – http://wiki.apache.org/thrift/ThriftUsageC%2B%2B – and letting the Python client make blocking calls to the server for data processing. This setup would avoid any database polling whatsoever.
    I like using named pipes for blocking the Python process in this case, as I had the exact same idea for blocking Linux shell scripts some time back. I have written about it here: http://42bits.wordpress.com/2010/12/05/notify-sh/

