ruby - EventMachine: What is the maximum number of parallel HTTP requests EM can handle?


I am building a distributed web crawler and am trying to get the most out of every single machine's resources. I run the crawl jobs in EventMachine through EM::Iterator and use em-http-request to make asynchronous HTTP requests. For now I have 100 iterations running at the same time, and it seems that I cannot get beyond that level: if I increase the number of iterations, it does not affect the crawling speed. However, I only get 10-15% CPU load and 20-30% network load, so there is plenty of room to crawl faster.

I am using Ruby 1.9.2. Is there a way to improve the code so that it uses resources effectively, or am I doing it wrong altogether?

    def start_job_crawl
      @redis.lpop @queue do |link|
        if link.nil?
          # queue is empty: check again in one second
          EventMachine::add_timer(1) { start_job_crawl() }
        else
          # parse the link using an asynchronous HTTP request
          parse(link)
        end
      end
    end

    # main reactor loop
    EM.run {
      EM.kqueue

      @redis = EM::Protocols::Redis.connect(:host => "127.0.0.1")
      @redis.errback do |code|
        puts "Redis error: #{code}"
      end

      # 100 parallel 'threads' (this is what I want to increase)
      EM::Iterator.new(0..99, 100).each do |num, iter|
        # ...
      end
    }

If you are using select() (which is the default for EM), the maximum is 1024, because select() is limited to 1024 file descriptors.

However, it looks like you are using kqueue, so it should be able to handle many more than 1024 file descriptors at once.
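
To make the two ceilings concrete, here is a rough sketch in plain Ruby (no EventMachine required). It assumes select()'s FD_SETSIZE is 1024, which is typical but platform-defined, and it also reads the process's own descriptor limit (what `ulimit -n` reports), since that caps kqueue/epoll as well. `SELECT_LIMIT` and `max_descriptors` are illustrative names, not part of EventMachine.

```ruby
# Assumed value of FD_SETSIZE; typical on Linux, BSD and macOS, but platform-defined.
SELECT_LIMIT = 1024

# Hypothetical helper: upper bound on simultaneously open descriptors
# for a given reactor backend, given the process's soft descriptor limit.
def max_descriptors(backend, soft_limit)
  case backend
  when :select
    # select() cannot watch descriptors numbered at or above FD_SETSIZE
    [SELECT_LIMIT, soft_limit].min
  when :epoll, :kqueue
    # no fixed set size; only the per-process limit applies
    soft_limit
  else
    raise ArgumentError, "unknown backend: #{backend.inspect}"
  end
end

# Soft and hard limits on open file descriptors for this process (Unix only).
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)

puts "descriptor limit: soft=#{soft}, hard=#{hard}"
puts "select ceiling:   #{max_descriptors(:select, soft)}"
puts "kqueue ceiling:   #{max_descriptors(:kqueue, soft)}"
```

If the soft limit printed here is itself low (256 and 1024 are common defaults), switching to kqueue alone will not raise the ceiling; the process limit would need to be raised as well.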
