Compact Hilbert Indices (Summer 2006)
This was joint work with Dr. Andrew Rau-Chaplin's OLAP group. It investigated algorithms for efficiently computing Hilbert curves, along with order-preserving representations of Hilbert curve indices that use the same amount of space as the original point representation. Such representations are useful when the Hilbert curve is used as a space-filling curve through a high-dimensional space in which not all dimensions have the same cardinality. This project has since been developed into a C++ library.
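To make the space argument concrete, here is a small sketch of the bit counts involved. It assumes the usual construction in which a standard Hilbert index pads every dimension to the width of the widest one, so the index occupies (number of dimensions) x (largest width) bits, whereas the original point needs only the sum of the per-dimension widths. The function names and the example cardinalities are hypothetical and are not taken from the project's library.

```python
def bits_for(cardinality: int) -> int:
    """Bits needed to store a coordinate that takes `cardinality` distinct values."""
    # (cardinality - 1).bit_length() is an exact integer ceil(log2(cardinality)).
    return max(1, (cardinality - 1).bit_length())


def index_sizes(cardinalities):
    """Return (point_bits, padded_index_bits) for the given dimension cardinalities.

    point_bits:        space used by the original point representation,
                       which is the target size for a compact index.
    padded_index_bits: space used by a Hilbert index when every dimension
                       is padded to the widest dimension (an assumption
                       about the standard construction).
    """
    widths = [bits_for(c) for c in cardinalities]
    point_bits = sum(widths)
    padded_index_bits = len(widths) * max(widths)
    return point_bits, padded_index_bits


if __name__ == "__main__":
    # Hypothetical dimensions with very different cardinalities.
    point_bits, padded_bits = index_sizes([1000, 4, 16])
    print(f"point: {point_bits} bits, padded index: {padded_bits} bits")
```

When all dimensions have the same cardinality the two counts coincide, so the saving only appears in the unequal case described above.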
For more details, refer to the page for this project.
Fault-Tolerant Parallel OLAP Data Warehouses (Fall 2005)
This is a course project for Dr. Andrew Rau-Chaplin's course on parallel computing (CSCI 6702). The project investigates various problems in the domain of OLAP warehousing, with a focus on questions related to compression and fault tolerance.
For more details, refer to the page for this project.
Stochastic Word-Alignment Through Matrix Factorization (Fall 2005)
This is a course project for Dr. Vlado Keselj's course on natural language processing (CSCI 6509). The project investigates the problem of word-aligning an already sentence-aligned parallel text corpus. The approach is stochastic, maximizing the net probability of alignment given a training corpus. Emphasis is on using fast matrix factorization techniques to reduce the search space.
For more details, refer to the page for this project.