Sunday, May 3, 2015

File Processing with Akka?

This is more of a design problem; I am not sure how to achieve it with Akka.

User Story
- I need to parse big files (> 10 million lines) which look like this:

2013-05-09 11:09:01 Local4.Debug    172.2.10.111    %MMT-7-715036: Group = 199.19.248.164, IP = 199.19.248.164, Sending keep-alive of type DPD R-U-THERE (seq number 0x7db7a2f3)
2013-05-09 11:09:01 Local4.Debug    172.2.10.111    %MMT-7-715046: Group = 199.19.248.164, IP = 199.19.248.164, constructing blank hash payload
2013-05-09 11:09:01 Local4.Debug    172.2.10.111    %MMT-7-715046: Group = 199.19.248.164, IP = 199.19.248.164, constructing qm hash payload
2013-05-09 11:09:01 Local4.Debug    172.2.10.111    %ASA-7-713236: IP = 199.19.248.164, IKE_DECODE SENDING Message (msgid=61216d3e) with payloads : HDR + HASH (8) + NOTIFY (11) + NONE (0) total length : 84
2013-05-09 11:09:01 Local4.Debug    172.22.10.111   %MMT-7-713236: IP = 199.19.248.164, IKE_DECODE RECEIVED Message (msgid=867466fe) with payloads : HDR + HASH (8) + NOTIFY (11) + NONE (0) total length : 84

  • For each line I need to generate an Event that will be sent to a server.
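As a starting point, one line can be parsed into an Event with a plain regex. This is only a sketch: the field names (timestamp, facility, device, code, message) are my guesses from the sample lines above, not a fixed schema.

```scala
import scala.util.matching.Regex

// Illustrative event shape inferred from the sample log lines.
case class Event(timestamp: String, facility: String, device: String, code: String, message: String)

object LineParser {
  // <timestamp>  <facility>  <device IP>  %<CODE>: <message>
  private val LinePattern: Regex =
    """(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\S+)\s+(\S+)\s+%([\w-]+):\s*(.*)""".r

  // Returns None for lines that do not match, so malformed input is skipped, not fatal.
  def parse(line: String): Option[Event] = line match {
    case LinePattern(ts, fac, dev, code, msg) => Some(Event(ts, fac, dev, code, msg))
    case _                                    => None
  }
}
```

This is the per-line work that each processing actor would do; keeping it a pure function makes it trivial to parallelize.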

Question
- How can I read this log file efficiently in the Akka model? I have read that reading a file sequentially from a single reader is better because it minimizes disk head movement (seeks).
- In that case, there could be one FileReaderActor per file that reads each line and sends it for processing to, let's say, an EventProcessorRouter; the router could have many actors each working on a line (from the file) and creating an Event. There would be one Event per line.
- I was also thinking of sending Events in batches to avoid too much network traffic. In that case, where should I keep accumulating these Events, and how would I know when all Events have been generated from the input file?
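The reader-plus-router design described above could be sketched with Akka classic actors roughly like this. All names (FileReaderActor, EventProcessor, ProcessLine) are hypothetical; only the Akka types (Actor, Props, RoundRobinPool) are real APIs.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.routing.RoundRobinPool
import scala.io.Source

case class ProcessLine(lineNo: Long, line: String)
case class AllLinesRead(total: Long)

// One worker behind the router; parses a line and would forward the Event onward.
class EventProcessor extends Actor {
  def receive = {
    case ProcessLine(n, line) =>
      // parse `line` into an Event here and send it (or hand it to an aggregator)
  }
}

// Single reader per file: the file is read sequentially (one pass, minimal seeking),
// and only cheap message sends fan the lines out to the worker pool.
class FileReaderActor(path: String, router: ActorRef, done: ActorRef) extends Actor {
  def receive = {
    case "start" =>
      val source = Source.fromFile(path)
      try {
        var n = 0L
        for (line <- source.getLines()) {
          n += 1
          router ! ProcessLine(n, line)
        }
        // Tell whoever tracks completion how many lines to expect.
        done ! AllLinesRead(n)
      } finally source.close()
  }
}

// Wiring (illustrative):
// val system = ActorSystem("logs")
// val router = system.actorOf(RoundRobinPool(8).props(Props[EventProcessor]), "workers")
// val reader = system.actorOf(Props(new FileReaderActor("big.log", router, aggregator)), "reader")
// reader ! "start"
```

Note that a message sent to a RoundRobinPool goes to a single worker, so the end-of-file signal is sent directly to a completion tracker rather than through the pool (a `Broadcast` wrapper would be the alternative).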
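For the batching question, one option is to accumulate Events in a single aggregator actor that flushes every N events, and to detect completion by counting: the reader reports the total number of lines it emitted, and the aggregator is done once it has seen that many Events. This is a sketch under those assumptions; the names (Event, SendBatch, AllLinesRead) are illustrative, not Akka APIs.

```scala
import akka.actor.{Actor, ActorRef}
import scala.collection.mutable.ArrayBuffer

case class Event(raw: String)
case class AllLinesRead(total: Long)   // sent by the reader after its last line
case class SendBatch(events: Seq[Event])

class EventAggregator(server: ActorRef, batchSize: Int) extends Actor {
  private val buffer = ArrayBuffer.empty[Event]
  private var seen = 0L
  private var expected: Option[Long] = None

  def receive = {
    case e: Event =>
      buffer += e
      seen += 1
      if (buffer.size >= batchSize) flush()  // bounded memory: at most batchSize buffered
      checkDone()
    case AllLinesRead(total) =>
      expected = Some(total)                 // may arrive before the last Events do
      checkDone()
  }

  private def flush(): Unit = {
    if (buffer.nonEmpty) server ! SendBatch(buffer.toVector)
    buffer.clear()
  }

  // Done only when the expected total is known AND every Event has arrived;
  // the final (possibly partial) batch is flushed before stopping.
  private def checkDone(): Unit =
    if (expected.contains(seen)) { flush(); context.stop(self) }
}
```

Counting works here because message order between reader and workers does not matter for the total, only that every line eventually produces exactly one Event; if some lines can fail to parse, the workers would need to report failures too so the counts still reconcile.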

Thanks
