source: archiver @ f907af8

Diff Rev Age Author Log Message
(edit) @5acb101   7 years vdv Rewriting resources with no archived out links
(edit) @4473ad6   7 years vdv Support HTML @background
(edit) @94d3351   7 years vdv Map application/xhtml+xml to .html
(edit) @5e2b674   7 years vdv Store the crawl log into the archive
(edit) @c25b18f   7 years vdv Support HTML embed/@src
(edit) @16ef797   7 years vdv Trying to guess content types
(edit) @bc581fa   7 years vdv Adapting relative links to match the structure of the browsable archive
(edit) @bf29805   7 years vdv Cleaning the algorithm to compute friendly local names.
(edit) @cfaf8ae   7 years vdv Adding XSLTUnit tests for the local-name function.
(edit) @a7c3525   7 years vdv Hmmm... HTML should be serialized as HTML, of course!
(edit) @c79bd8e   7 years vdv Forcing HTML content type for XHTML documents
(edit) @9bce34f   7 years vdv Rewriting links in HTML and CSS resources within WARC archives
(edit) @5b162a6   7 years vdv WARC main extract loop
(edit) @466d447   7 years vdv Generating a resource index to facilitate further processing.
(edit) @675ed04   7 years vdv Download and convert the crawl log
(edit) @6f64c7f   7 years vdv Handling payload content types
(edit) @be1a361   7 years vdv Implementing yet another WARC parser (the Heritrix one didn't work well …
(edit) @307b6d2   7 years vdv Adding whois records
(edit) @22c3028   7 years vdv First stab at WARC packaging.
(edit) @51c2058   7 years vdv Queue an action to package the Heritrix WARC.
(edit) @b346236   7 years vdv Adding a mechanism to delay actions in the queue.
(edit) @3bcb813   7 years vdv Unpause Heritrix job.
(edit) @f25a924   7 years vdv Modifying the way the Heritrix (spring) config file is generated since it …
(edit) @a3fa073   7 years vdv Update to follow changes to Orbeon Forms experimental features…
(edit) @a1dc635   7 years vdv Update to follow changes to Orbeon Forms experimental features…
(edit) @57daa70   7 years vdv Now building and launching Heritrix jobs…
(edit) @be2f974   7 years vdv Update to follow changes to Orbeon Forms experimental features…
(edit) @c4c4108   7 years vdv Starting to write pipeline actions that interact with a Heritrix server
(edit) @ad35672   7 years vdv Still work in progress, but the WARC archive now validates with …
(edit) @ba51ddf   7 years vdv Starting to support content lengths in WARC archives
(edit) @9d99928   7 years vdv Removing the last action from the queue
(edit) @01a6690   7 years vdv First version that can produce a packaged archive.
(edit) @5ac9ea9   7 years vdv Packaging resources that have not been rewritten…
(edit) @0e7bdd1   7 years vdv Adding a basic skeleton to generate what should ultimately be a WARC …
(edit) @3d18e9d   7 years vdv Adding a mechanism to avoid archiving the same resource multiple times …
(edit) @cf97a98   7 years vdv First version supporting CSS rewriting
(edit) @750ccaa   7 years vdv Dummy (passthrough) implementation of the CSS support…
(edit) @16cc943   7 years vdv Refactoring before supporting CSS
(edit) @11027c0   7 years vdv Moving action pipelines in their own directory
(edit) @a0bd1a5   7 years vdv Adding a priority mechanism
(edit) @6b10b3e   7 years vdv Removing an xsl:message.
(edit) @fd2ca8f   7 years vdv Adding timestamps to the archive indexes
(add) @c71d5b2   7 years vdv Starting to implement a version of the archiver based on Orbeon's XPL.