Show simple item record

dc.contributor.authorPotamias, Michalisen_US
dc.date.accessioned2011-10-20T04:51:08Z
dc.date.available2011-10-20T04:51:08Z
dc.date.issued2009en_US
dc.identifier.citationPotamias, Michalis. "Indexing Distances In Large Graphs And Applications In Search Tasks (MA Thesis)", Technical Report BUCS-TR-2008-029, Computer Science Department, Boston University, December 1, 2008. [Available from: http://hdl.handle.net/2144/1721]en_US
dc.identifier.urihttp://hdl.handle.net/2144/1721
dc.description.abstractThis thesis elaborates on the problem of preprocessing a large graph so that single-pair shortest-path queries can be answered quickly at runtime. Computing shortest paths is a well studied problem, but exact algorithms do not scale well to real-world huge graphs in applications that require very short response time. The focus is on approximate methods for distance estimation, in particular in landmarks-based distance indexing. This approach involves choosing some nodes as landmarks and computing (offline), for each node in the graph its embedding, i.e., the vector of its distances from all the landmarks. At runtime, when the distance between a pair of nodes is queried, it can be quickly estimated by combining the embeddings of the two nodes. Choosing optimal landmarks is shown to be hard and thus heuristic solutions are employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the techniques presented in this thesis is tested experimentally using five different real world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach which considers selecting landmarks at random. Finally, they are applied in two important problems arising naturally in large-scale graphs, namely social search and community detection.en_US
dc.language.isoen_USen_US
dc.publisherBoston University Computer Science Departmenten_US
dc.relation.ispartofseriesBUCS Technical Reports;BUCS-TR-2008-029en_US
dc.titleIndexing Distances in Large Graphs and Applications in Search Tasksen_US
dc.typeTechnical Reporten_US
etd.degree.nameMaster of Arts
etd.degree.levelmasters
etd.degree.disciplineComputer Science
etd.degree.grantorBoston University


Files in this item

This item appears in the following Collection(s)

Show simple item record