# Disable crawler (true to completely disable all crawling related code)
DISABLE_CRAWLER=false
Meaning: This determines whether this server will perform the crawling duties. Currently only one server doing the crawling is supported.
# Disable search (true to completely disable all search related code)
DISABLE_SEARCH=false
Meaning: This server will perform searches on the database. Any webNetwork Server that has relays connected to it should leave search enabled so that searching works.
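A minimal sketch of how these two flags might gate startup, assuming they are read as simple true/false properties; the start_crawler/start_search functions are hypothetical placeholders for the real subsystems, not the product's actual API.

```python
import os

def is_disabled(name: str) -> bool:
    """Treat the property as disabled only when it is explicitly 'true'."""
    return os.environ.get(name, "false").strip().lower() == "true"

def start_crawler():
    print("crawler started")   # placeholder for the real crawler startup

def start_search():
    print("search started")    # placeholder for the real search startup

if not is_disabled("DISABLE_CRAWLER"):
    start_crawler()   # only one server in the network should run the crawler
if not is_disabled("DISABLE_SEARCH"):
    start_search()    # servers with relays connected should keep search enabled
```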
# Interval for how often to ask for users that need to be crawled.
# See information at beginning of file about format of Duration/Interval properties.
GETDATA_PING_USER_INTERVAL=5 hours
Meaning: Every 5 hours (and right after startup), search asks webStorage for the list of users that use webStorage and puts it into the PWSUser table. A user who is already in the table is not re-added; the idea is that PWSUser keeps a first-in-first-out (by priority) list of users. So when the server starts, the table is populated, the users are fed into the crawler and removed from the table once their crawls finish, and 5 hours later they are put back into the table to be considered for another crawl.
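A minimal sketch of that population step, assuming the PWSUser table can be modeled as a map keyed by user name; fetch_webstorage_users and the field names are illustrative stand-ins, not webStorage's real API.

```python
import time

GETDATA_PING_USER_INTERVAL = 5 * 60 * 60   # "5 hours" in seconds

pws_user = {}   # stand-in for the PWSUser table: name -> record

def fetch_webstorage_users():
    # placeholder: would ask webStorage for every user that uses webStorage
    return [("alice", 1), ("bob", 2)]

def populate_pws_user():
    for name, priority in fetch_webstorage_users():
        if name in pws_user:
            continue                      # already queued: do not re-add
        pws_user[name] = {
            "status": "waiting",          # picked up later by the feeder
            "priority": priority,
            "added": time.time(),         # FIFO order within the same priority
        }

# Runs right after startup and then again every GETDATA_PING_USER_INTERVAL.
populate_pws_user()
```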
# Interval for how often to feed users that need to be crawled into the mount point manager.
# See information at beginning of file about format of Duration/Interval properties.
GETDATA_PING_MOUNT_POINT_INTERVAL=30 seconds
Meaning: Every 30 seconds, search looks into the PWSUser table for users with a status of waiting. If there are any, it passes the first x users (x = CRAWL_USER_COUNT) to crawl workers, assuming some are available (you only have 3 index workers).
When a user is picked up, his status in PWSUser changes to scheduling; when a tree worker starts to crawl him, the status changes to crawling. When the crawl is done, the user is deleted from the PWSUser table.
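A minimal sketch of the 30-second feed step, reusing the same in-memory stand-in for PWSUser as above; the value of CRAWL_USER_COUNT and the field names are assumptions for illustration only.

```python
CRAWL_USER_COUNT = 5                     # assumed value; set by its own property
GETDATA_PING_MOUNT_POINT_INTERVAL = 30   # "30 seconds"

def feed_waiting_users(pws_user, workers_available):
    # Oldest first within the highest priority, matching the FIFO-by-priority idea.
    waiting = sorted(
        (name for name, rec in pws_user.items() if rec["status"] == "waiting"),
        key=lambda name: (pws_user[name]["priority"], pws_user[name]["added"]),
    )
    for user in waiting[:min(CRAWL_USER_COUNT, workers_available)]:
        pws_user[user]["status"] = "scheduling"   # handed to the scheduler
        # ... a tree worker later flips the status to "crawling", and the row
        # is deleted from PWSUser when the crawl finishes.
```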
# Minimum time between crawls for the same user. The user will not be allowed
# to be fed back into the scheduler by the crawler unless at least this much time has
# elapsed since the last crawl for that user.
# See information at beginning of file about format of Duration/Interval properties.
MINIMUM_NEXT_USER_CRAWL_DURATION=24 hours
Meaning: No matter how often the settings above put a user into the PWSUser table, if it has not been 24 hours since the last crawl started for that user, he will not be crawled. You will not see any worker debug output for that user because he is never sent to a worker. Look in the PWSLastCrawl table to see when each user was last crawled. If you delete the data in PWSLastCrawl, the next time a user shows up in the list he will be crawled.
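A minimal sketch of that 24-hour gate, with an in-memory map standing in for the PWSLastCrawl table (user name -> timestamp of the last crawl start); may_crawl and its fields are hypothetical names, not the product's real API.

```python
import time

MINIMUM_NEXT_USER_CRAWL_DURATION = 24 * 60 * 60   # "24 hours" in seconds

last_crawl_start = {}   # stand-in for PWSLastCrawl

def may_crawl(user, now=None):
    """True only if the configured minimum has passed since the user's last crawl started."""
    now = time.time() if now is None else now
    started = last_crawl_start.get(user)
    if started is None:
        return True                       # no row in PWSLastCrawl: eligible immediately
    return now - started >= MINIMUM_NEXT_USER_CRAWL_DURATION

# Deleting a user's row (last_crawl_start.pop(user, None) here) makes his next
# appearance in the PWSUser list eligible for crawling right away.
```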