2

My problem is that a proprietary SaaS platform I'm developing on/for only provides log files via WebDAV. During development these log files can get quite large by the end of the day (think 200 MB+), but they are very detailed and useful when trying to track down a "general error".

What happens now is that to look at the log file I have to download the whole 200 MB file every single time (it does not get automatically recreated if I delete it :( ), meaning that even on a good connection I have to wait 1-2 minutes for the download to finish.

So the actual question again: is there any tool out there that can check (for example) the timestamp of a file every 5 seconds and download just the added chunk, using the calculated difference in size?

1
  • +1. Good question. I am working on a way to quickly sync files across WebDAV to the client system and looking into zsync. I have provided a high-level solution below for your requirement.
    – Kent Pawar
    Commented Dec 2, 2013 at 14:34

2 Answers

1
  • WebDAV is an extension to HTTP, and none of its currently available methods [1] supports downloading only the newly appended part of a file.

  • You could look at alternative methods for partial file transfer over HTTP, such as zsync, which transfers only the changed content.


1. zsync

Abstract : This document describes the thinking behind zsync, a new file transfer program which implements efficient download of only the content of a file which is not already known to the receiver. zsync uses the rsync algorithm, but implemented on the client side, so that only one-off pre-calculations are required on the server, and no special server software or new protocol is required to use zsync.

UPDATE:

As per the rsync algorithm:

  • One side calculates the checksums of each distinct block of data, and sends it to the other end. The other end then does a rolling checksum through its file, identifying blocks in common, and then working out which blocks are not in common and must be transmitted.

  • In rsync, the server does all the hard work while in zsync, the client requesting the data does all the hard work of applying the rsync rolling checksum and comparing with the downloaded checksum list.

From [4]

... So, we make it the server which calculates the checksums of each distinct block. Because it need calculate only one checksum per block of data, and this is not specific to any given client, the data can be cached. We can save this data into a metafile, and the client requests this data as the first step of the process. This metafile can simply be at another URL on the same — or even a different — server.

The zsync client will pull this metafile. It then runs through the data it already has, applying the rsync rolling checksum and comparing with the downloaded checksum list. It thus identifies the data in the target file that it already has. It then requests the remaining data from the server. Since it knows which data it needs, it can simply use HTTP Range requests to pull the data. ...

  • So HTTP Range headers need to be supported on the server-side.
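To make the client-side step more concrete, here is a minimal sketch of an rsync-style weak rolling checksum and of how a client could scan its local copy for blocks the server already describes. It is an illustration only: the function names, the 16-bit modulus, and the `remote_sums` dictionary are assumptions made for this example, not zsync's actual metafile format, and a real client would also confirm each weak match with a strong checksum.

```python
def weak_checksum(block: bytes):
    """Compute the two 16-bit sums (a, b) of an rsync-style weak checksum."""
    n = len(block)
    a = sum(block) & 0xFFFF
    # Each byte is weighted by its distance from the end of the block.
    b = sum((n - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide the window one byte to the right without rescanning the block."""
    a = (a - out_byte + in_byte) & 0xFFFF
    b = (b - n * out_byte + a) & 0xFFFF
    return a, b


def find_known_blocks(local: bytes, remote_sums: dict, n: int) -> dict:
    """Map local offsets to remote block indexes whose weak checksum matches.

    remote_sums is a hypothetical {(a, b): block_index} table, standing in
    for the checksum list the client would download from the metafile.
    """
    found = {}
    if len(local) < n:
        return found
    a, b = weak_checksum(local[:n])
    k = 0
    while True:
        if (a, b) in remote_sums:
            found[k] = remote_sums[(a, b)]
        if k + n >= len(local):
            break
        a, b = roll(a, b, local[k], local[k + n], n)
        k += 1
    return found
```

Any remote block that does not show up in the result is one the client would have to request from the server with a Range request.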

2. Alternatives to zsync

  • If these log files only grow, so that the existing content never changes, then a plain HTTP Range header would suffice. [2][3]
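As a rough sketch of that idea, the following asks the server only for the bytes beyond the current size of the local copy and appends them. It assumes the server honours Range requests (answering 206 Partial Content) and that the file is append-only. The names `fetch_appended` and `build_range_request` and the injectable `opener` parameter are made up for this example, and authentication is left out. If the server ignores the header and replies 200 with the full file, the local copy is simply overwritten.

```python
import os
import urllib.error
import urllib.request


def build_range_request(url: str, offset: int) -> urllib.request.Request:
    """Build a GET request for everything from byte `offset` to the end."""
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-"})


def fetch_appended(url: str, local_path: str, opener=urllib.request.urlopen) -> int:
    """Download only the part of `url` that the local copy is missing.

    Returns the number of bytes written to `local_path`.
    """
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    request = build_range_request(url, offset)
    try:
        with opener(request) as response:
            status = response.status
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 416:  # Range Not Satisfiable: nothing new yet
            return 0
        raise
    # 206 means the server sent just the requested tail, so append it.
    # Any other success status (typically 200) carries the whole file.
    mode = "ab" if status == 206 else "wb"
    with open(local_path, mode) as f:
        f.write(data)
    return len(data)
```

Calling `fetch_appended` in a loop with a few seconds of sleep between calls would give the polling behaviour asked about in the question.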

Reference:

  1. http://www.webdav.org/specs/#dav
  2. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35
  3. http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p5-range-latest.html#range.requests
  4. Zsync theory
2
  • Thanks for zsync, will have a look at it, it sounds very interesting. Regarding Range: unfortunately the server I am talking to does not support it :S (like so many other things).
    – KillerX
    Commented Dec 4, 2013 at 9:56
  • You're welcome. But zsync would require Range headers to be supported on the server side. I have updated my post to highlight this.
    – Kent Pawar
    Commented Dec 4, 2013 at 11:58
1

If the client is a Windows computer, try this: map the WebDAV share as a network drive and run a tail command on the file.
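If no tail command is at hand, the same idea can be sketched in a few lines of Python that read only what was appended since the last check. The function name `read_new` is made up for this example, and `path` would point at the log file on the mapped drive. Whether the WebDAV client then transfers only the new bytes over the network, rather than re-fetching the whole file behind the scenes, depends on the client and server and is not guaranteed.

```python
import os


def read_new(path: str, offset: int):
    """Return (data, new_offset) for bytes appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size < offset:
            # The file shrank (truncated or replaced), so start from the top.
            offset = 0
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)
```

Calling it repeatedly with the returned offset, with a short sleep in between, behaves roughly like `tail -f`.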
