The previous Search Engine could not work in real time and could only fetch information already saved in its database.
So there was no mechanism for building the index file and keeping information current with the user's search query. Most of the time, users could not find the information they wanted, and this limitation held them back. Users could only get information in the form of web pages linked to other pages. The existing system could not handle and store files and information using proper keywords and tags in formats such as PDF, DOC, or ZIP.
A special mechanism for web crawling has been integrated with this Search Engine project. To work in real time, it enables the system to go through various web pages using the keyword and prepare a list of them based on the search pattern. The applet sets up a search panel with a text box to enter a query and a submit button to run the query. See the source file for more details.
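As a rough illustration, a search panel of the kind described above could be built like this in Swing. This is a minimal sketch; the class and field names are my own, not taken from the project's source file:

```java
import javax.swing.JButton;
import javax.swing.JPanel;
import javax.swing.JTextField;

// Hypothetical sketch of the applet's search panel: a text box for the
// query and a submit button to run it. Names are illustrative only.
public class SearchPanel extends JPanel {
    final JTextField queryField = new JTextField(40);
    final JButton submitButton = new JButton("Submit");

    public SearchPanel() {
        add(queryField);
        add(submitButton);
        // In the real applet, pressing the button would submit the
        // query to the remote search server; here we just echo it.
        submitButton.addActionListener(e ->
            System.out.println("query = " + queryField.getText()));
    }
}
```

The real applet wires the button's action to the search submission logic discussed below.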
Notice the private helper class SimpleSearch. This is where the search is set up and executed, and we will look at it in more detail here. The applet then runs the query through SimpleSearch. The SimpleSearch constructor takes the location of the search server as an argument; in this way, a single SimpleSearch object can be used repeatedly to run searches against the remote search server.
Before submitting the search, SimpleSearch needs to create a Search object. The constructor for the Search class takes several arguments, as discussed previously. The search scope is the actual query run by the search server; it is the scope argument to doSearch and ultimately derives from either the applet input panel or a command-line argument to the main program.
The requested attribute set shown here causes the server to return the score, URL, title, and description of all documents that match the query. The sort order is a comma-delimited list of the attributes used to sort the results: a minus sign indicates descending order, and a plus sign indicates ascending order. In this case, the results are sorted by decreasing numerical score, with alphabetical order of the title as the secondary sort key.
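The sort specification can be mirrored in plain Java. The sketch below is not SDK code; the Hit record is a hypothetical stand-in for a search hit, used only to show what a sort string like "-score,+title" means:

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortOrderDemo {
    // Hypothetical stand-in for one search hit; the real server
    // returns SOIF objects, not this record.
    record Hit(double score, String title) {}

    // Equivalent of the sort string "-score,+title": descending score
    // first, then ascending title as the secondary key.
    static final Comparator<Hit> VIEW_ORDER =
        Comparator.comparingDouble(Hit::score).reversed()
                  .thenComparing(Hit::title);

    public static void main(String[] args) {
        Hit[] hits = {
            new Hit(0.8, "Banana"), new Hit(0.9, "Apple"), new Hit(0.8, "Apple")
        };
        Arrays.sort(hits, VIEW_ORDER);
        for (Hit h : hits)
            System.out.println(h.score() + " " + h.title());
        // 0.9 Apple, then 0.8 Apple, then 0.8 Banana
    }
}
```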
The remaining arguments are not used in this case, implying an anonymous search. The search is executed again for each page of results. It is possible to cache the results for all pages with a single search, of course, but it is often easier to simply resubmit the search each time; this is equivalent to a user clicking a Next button in a search user interface. The results are stored in the Search object. Now do something with the results.
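Resubmitting the search per page amounts to advancing the first-hit index by the page size each time. A small sketch, under the assumption that hit numbering is 1-based (check your server's documentation):

```java
public class PagingDemo {
    // First hit number for a given page, assuming 1-based hit
    // numbering (an assumption, not confirmed by the original text).
    static int firstHitForPage(int page, int pageSize) {
        return (page - 1) * pageSize + 1;
    }

    public static void main(String[] args) {
        int pageSize = 10;
        // Each "Next" click resubmits the same query with a new first hit.
        for (int page = 1; page <= 3; page++)
            System.out.println("page " + page + " starts at hit "
                               + firstHitForPage(page, pageSize));
    }
}
```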
They each show a different way of extracting the results from the Search object. The example application displays the query results to standard output or to a named file.
In reality, you would do more with the results than just print them like this, but once you know how to get the results out of the Search object, it is up to you what you do with them.
You can use standard Java functionality to process the results in any way you like. Each result is read from this SOIF stream in turn.
Note that the client-server connection uses an efficient streamed protocol; it is conceivable that the server is still returning later results while the client is processing the first ones. Now, retrieve each search hit from the result stream as a SOIF object and print its URL, title, description, and score to the output stream (either the Java console, standard output, or a named output file).
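To make the result-processing step concrete without the search SDK on hand, here is a simplified, self-contained parser for one textual SOIF object. It ignores the byte counts in braces (real SOIF uses them to delimit multi-line and binary values), so treat it as a sketch rather than a conformant reader:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SoifDemo {
    // Parse one textual SOIF object of the form
    //   @SCHEMA-NAME ( url )
    //   attribute{size}: value
    // into an attribute map. Simplified: one value per line, byte
    // counts ignored, no binary values.
    static Map<String, String> parse(String soif) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (String line : soif.split("\n")) {
            if (line.startsWith("@")) {
                int open = line.indexOf('('), close = line.lastIndexOf(')');
                if (open >= 0 && close > open)
                    attrs.put("url", line.substring(open + 1, close).trim());
            } else {
                int brace = line.indexOf('{'), colon = line.indexOf(':');
                if (brace > 0 && colon > brace)
                    attrs.put(line.substring(0, brace).trim(),
                              line.substring(colon + 1).trim());
            }
        }
        return attrs;
    }

    public static void main(String[] args) {
        String soif = "@DOCUMENT ( http://example.com/ )\n"
                    + "title{12}: Example page\n"
                    + "score{4}: 0.95";
        Map<String, String> rd = parse(soif);
        System.out.println(rd.get("url") + " | " + rd.get("title")
                           + " | " + rd.get("score"));
    }
}
```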
The URL is special: it is stored in the SOIF object header rather than as an ordinary attribute. The program rdmgr is used to add data to the database from the command line. This section describes how to create input data for rdmgr so that it can be added to the database.
The rdmgr utility can add new data as well as replace, modify, or retrieve existing data. Note that SOIF also supports binary-valued fields and they can be added or retrieved too.
In the simplest case, rdmgr can be used to add a file containing multiple SOIF objects to the database. The search robot calls rdmgr in this manner to index data it collects from its crawling runs.
In the general case, though, rdmgr accepts a complete resource description submit request as input. Here is an example of constructing a request that can be used as a second argument to rdmgr. First, write the header part of the RDM to send to the database.
An update request to the search engine consists of a set of attribute-value pairs. Next, we create the body part of the submit request. Finally, the request is saved to a file for input to rdmgr.
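As a hedged sketch of the shape of such an input file (the header attribute names used here are placeholders, not copied from the server's manual, which should be consulted for the exact set), the request can be assembled and saved like this:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SubmitRequestDemo {
    // Assemble a submit request for rdmgr: a header section followed
    // by the RD body in SOIF syntax. The "@REQUEST" header attribute
    // shown is a placeholder for illustration only.
    static String buildRequest(String url, String title) {
        String header = "@REQUEST ( - )\n"
                      + "submit-operation{6}: insert\n\n"; // placeholder header
        String body = "@DOCUMENT ( " + url + " )\n"
                    + "title{" + title.length() + "}: " + title + "\n";
        return header + body;
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("submit", ".soif");
        Files.writeString(input,
            buildRequest("http://example.com/", "Example page"));
        // The resulting file can now be passed to rdmgr on the command line.
        System.out.println(Files.readString(input));
    }
}
```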
When this input is processed by rdmgr, it results in the RD shown being added to the database and indexed. The rdmgr utility supports other types of requests too: for example, a retrieval request fetches the requested fields (the submit view) for the requested RDs, and the server returns those fields for each RD.