AnyBook4Less.com | Order from a Major Online Bookstore |
Title: Managing Gigabytes: Compressing and Indexing Documents and Images by Ian H. Witten, Alistair Moffat, Timothy C. Bell ISBN: 1-55860-570-3 Publisher: Morgan Kaufmann Publishers Pub. Date: 15 May, 1999 Format: Hardcover Volumes: 1 List Price(USD): $62.95
Average Customer Rating: 4.7 (10 reviews)
Rating: 4
Summary: Good introduction to searching/indexing in data.
Comment: MG gave a good introduction to the components of practical Information Retrieval (IR). You can clearly see that the authors have a genuine interest in the field! But I would like some more theoretical analysis of the algorithms used (i.e. O-notation), and more focus on parallel implementations of IR systems. Another book in the same area worth mentioning is "Modern Information Retrieval".
Rating: 5
Summary: Great Book on Information Retrieval
Comment: Managing Gigabytes is the best book out there on information retrieval. If you're interested in implementing your own IR system, there's nothing available that comes close to this book. But the book is good not just because it's the only one out there: the writing is excellent, the algorithms are presented clearly and explained well, and the coverage is thorough. Additionally, the coverage of compression algorithms is the best I've found in any book. All algorithms and pseudo-code in the book are presented clearly enough such that any competent programmer should be able to implement them. If all else fails, however, the free downloadable source code for the mg system can fill in any gaps.
All in all, this is the best computer science book I've purchased in years. I wish all CS books were written like this one: it doesn't skimp on the theory or on the implementation details.
Rating: 5
Summary: The Wonderful Thing Is: It's the Only One
Comment: This is the only book there is that will actually teach you how to build an information retrieval system (aka search engine). It discusses all the algorithms and tradeoffs, and comes with free downloadable source code to experiment with. Some of the material is standard, but covered in more implementation detail here than anywhere else. Some of the material is novel: you won't find better coverage of compression unless you hand-assemble twenty research papers, and reverse-engineer them to figure out how they're implemented. But with "Managing Gigabytes", it's all here. (Although, after a particularly invigorating discussion of how to string together a bunch of techniques to compress their corpus and save a couple hundred MB, I did a check and found you could buy 512MB of RAM for less than the cost of the book. Knowledge is Power, but sometimes a little cash is more powerful.) The only negative is that this book is not called "Managing Terabytes", as the first edition promised/threatened it might be. RAM and disk are cheap, but not that cheap, and for now terabytes (and sometimes petabytes) are managed only by NASA, Google, and a few others. I can't wait to see the third edition!
Title: Modern Information Retrieval by Ricardo Baeza-Yates, Berthier Ribeiro-Neto ISBN: 020139829X Publisher: Addison Wesley Publishing Company Pub. Date: 01 February, 1999 List Price(USD): $50.00
Title: Mining the Web: Analysis of Hypertext and Semi Structured Data by Soumen Chakrabarti ISBN: 1558607544 Publisher: Morgan Kaufmann Publishers Pub. Date: 15 August, 2002 List Price(USD): $54.95
Title: Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization (Natural Language Processing, 5) by Peter Jackson, Isabelle Moulinier ISBN: 1588112500 Publisher: John Benjamins Publishing Co Pub. Date: 01 June, 2002 List Price(USD): $39.95
Title: Foundations of Statistical Natural Language Processing by Christopher D. Manning, Hinrich Schütze ISBN: 0262133601 Publisher: MIT Press Pub. Date: 18 June, 1999 List Price(USD): $75.00
Title: Modeling the Internet and the Web: Probabilistic Methods and Algorithms by Pierre Baldi, Paolo Frasconi, Padhraic Smyth ISBN: 0470849061 Publisher: John Wiley & Sons Pub. Date: 28 May, 2003 List Price(USD): $90.00