How Earth Works
Overview
The Earth system consists of three components:
- The Earth daemon, earthd, running on each computer whose file system is to be indexed.
- A database that stores the index.
- A Web application for browsing and querying the index from a Web browser.
This setup is illustrated in the diagram to the right.
A: Indexing Local File Systems
The Earth daemon runs on each host you want to index, continuously scanning a configured set of directories for updates.
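The idea of repeatedly scanning directories for updates can be sketched as a comparison between two snapshots of the file tree. This is only an illustration, not how earthd is actually implemented: the function names `snapshot` and `diff` are hypothetical, and the choice of file size plus modification time as the change signal is an assumption.

```python
import os


def snapshot(roots):
    """Map every file path under the given roots to (size, mtime_ns)."""
    seen = {}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path, follow_symlinks=False)
                except OSError:
                    # The file vanished between listing and stat; skip it.
                    continue
                seen[path] = (st.st_size, st.st_mtime_ns)
    return seen


def diff(old, new):
    """Return sorted (created, deleted, modified) path lists."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, deleted, modified
```

A daemon built along these lines would take a snapshot, wait, take another, and send the three lists returned by `diff` to the index. Rescanning a whole tree is expensive on very large file systems, so a real implementation would likely need something more incremental.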
B: Index Storage and Presentation
A single SQL database is used to store the index generated by all the Earth daemons. That SQL database is queried by the Earth Web application in response to client requests, providing various views of the index information.
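As a rough sketch of how one database can hold the index for several hosts and serve different views of it, the example below uses SQLite from the Python standard library. Everything in it is an assumption made for illustration: the `files` table, its columns, and the helper names are hypothetical and do not reflect Earth's actual schema or database engine.

```python
import sqlite3


def open_index(dsn=":memory:"):
    """Open the index database and create the (hypothetical) files table."""
    conn = sqlite3.connect(dsn)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS files (
               host TEXT NOT NULL,
               path TEXT NOT NULL,
               size INTEGER NOT NULL,
               PRIMARY KEY (host, path)
           )"""
    )
    return conn


def record(conn, host, path, size):
    """Insert or update one file entry reported by the daemon on `host`."""
    conn.execute(
        "INSERT OR REPLACE INTO files (host, path, size) VALUES (?, ?, ?)",
        (host, path, size),
    )


def usage_by_host(conn):
    """One possible view: file count and total bytes per indexed host."""
    return conn.execute(
        "SELECT host, COUNT(*), SUM(size) FROM files "
        "GROUP BY host ORDER BY host"
    ).fetchall()
```

Keying each row on the host and path together lets every daemon write to the same table without colliding with entries from other hosts, while the Web application can still aggregate across all of them in a single query.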
C: Accessing the Information Using a Web Browser
People across your organization can view and query the index simply by pointing their Web browser at the URL of the Earth Web application.
Recommended Setup
Running Earth On a Single Machine
There is no requirement to run the individual components of Earth on different computers. It is fine to run the Earth daemon, the database, the Web application, and the Web browser all on the same host. In fact, this is the setup we use for developing and testing Earth.
Running Earth In a Large Organization
Earth's primary purpose, however, is to index very large file systems (in the terabyte range). In this situation, it is advisable to run the database on a dedicated, fast system, since it will be under high load whenever large numbers of files or directories are created or deleted on an indexed file system.
You should also avoid using the Earth daemon to scan remote file systems. Instead, run the daemon on each host which has local disks you want to index. Usually this means running it on all your file servers, but you might also want to consider indexing workstations or other servers.
Attachments
- Earth Network Diagram 1.png (61.7 kB) - Diagram showing the Earth network topology, added by julians on 03/18/07 15:33:11.
