The goal of libextractor is to provide developers of file-sharing networks, browsers, or WWW-indexing bots with a universal library for obtaining simple keywords and meta data, which can be matched against queries and shown to users instead of relying only on filenames. libextractor includes a shell command, extract, that, like the well-known file command, extracts meta data from a file and prints the results to stdout.
Currently, libextractor supports the following formats: HTML, PDF, PS, OLE2 (DOC, XLS, PPT), OpenOffice (sxw), StarOffice (sdw), DVI, MAN, FLAC, MP3 (ID3v1 and ID3v2), NSF(E) (NES music), SID (C64 music), OGG, WAV, EXIV2, JPEG, GIF, PNG, TIFF, DEB, RPM, TAR(.GZ), ZIP, ELF, S3M (Scream Tracker 3), XM (eXtended Module), IT (Impulse Tracker), FLV, REAL, RIFF (AVI), MPEG, QT and ASF.
Also, various additional MIME types are detected.
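To illustrate how the output of the extract command could be matched against a query, here is a minimal Python sketch. It assumes that extract is installed and on the PATH, and that it prints one "type - value" pair per line after a "Keywords for file ..." header; that output layout is an assumption and may differ between versions. The function names parse_extract_output, matches and extract_metadata are hypothetical helpers, not part of libextractor.

```python
import subprocess


def parse_extract_output(text):
    """Parse lines of the assumed form 'type - value' into a dict of lists."""
    meta = {}
    for line in text.splitlines():
        # Skip the per-file header and any line that is not 'type - value'.
        if line.startswith("Keywords for file") or " - " not in line:
            continue
        key, value = line.split(" - ", 1)
        meta.setdefault(key.strip(), []).append(value.strip())
    return meta


def matches(meta, query):
    """Return True if the query occurs (case-insensitively) in any value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in meta.values()
               for value in values)


def extract_metadata(path):
    """Run the extract command on a file (assumes it is on the PATH)."""
    result = subprocess.run(["extract", path],
                            capture_output=True, text=True, check=True)
    return parse_extract_output(result.stdout)
```

A caller would use something like matches(extract_metadata("report.pdf"), "libextractor") to decide whether a file is relevant to a search, rather than looking only at its filename.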
libextractor is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Download.
Subversion access.
You can access the current development version of libextractor using
$ svn checkout https://ng.gnunet.org/svn/Extractor
A Java binding for libextractor is in
$ svn checkout https://ng.gnunet.org/svn/Extractor-java
A Mono binding for libextractor is in
$ svn checkout https://ng.gnunet.org/svn/Extractor-mono
A Python binding can be found under
$ svn checkout https://ng.gnunet.org/svn/Extractor-python
An initial draft of a PHP binding also exists.
Debian .deb package.
The Debian package can be downloaded from the official Debian archive. The extract package can be found under Utilities and the library under Libraries. The relevant packages are extract (the command-line tool), libextractor (the library), and libextractor-dev (development files). Backports for Debian Stable are also available.
Tar package.
The latest version can be found on GNU mirrors. If the mirror does not work, you should be able to find it on the main FTP server at ftp://ftp.gnu.org/libextractor/.
Latest release is libextractor-0.6.2.tar.gz.
Latest Java-binding is libextractor-java-0.6.0.tar.gz.
Latest Mono-binding is libextractor-mono-0.5.23.tar.gz.
Latest Python-binding is libextractor-python-0.5.tar.gz.
1. Introduction: What is GNU libextractor.
2. Preparation: What you should do before using the library.
3. Generalities: General library functions and data types.
4. Extracting meta data: How to use GNU libextractor to obtain meta data.
5. Language bindings: How to use GNU libextractor from languages other than C.
6. Utility functions: Utility functions of GNU libextractor.
7. Existing Plugins: What plugins are available.
8. Writing new Plugins: How to write new plugins for GNU libextractor.
9. Internal utility functions: Utility functions of GNU libextractor for writing plugins.
10. Reporting bugs: How to report bugs or request new features.