lias-laboratory/pandasql

PandaSQL

This project implements and optimizes the Randomized Triangle Enumeration algorithm using SQL queries.

Requirements

Downloads

The repository contains the following elements:

  • CLI version (cli directory), which contains:
    • Vertica_codes: two Python scripts:
      • Standard_TE.py enumerates triangles using the standard algorithm query,
      • Randomized_TE.py enumerates triangles using the optimized randomized algorithm queries.
    • Triplet: color triplet files, chosen according to the size of the cluster (8,27,64,…)
  • GUI version (gui directory) with an HTML page interface
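
The cluster sizes listed for the triplet files (8, 27, 64) are perfect cubes, which is consistent with one color triplet per machine when k colors are used on k³ machines. The sketch below generates such triplets under that assumption. The function name `color_triplets`, the 1-based color numbering, and the printed layout are hypothetical and may not match the files in the Triplet folder.

```python
from itertools import product


def color_triplets(num_machines):
    """Return every (c1, c2, c3) color triplet for a cluster of k**3 machines."""
    k = round(num_machines ** (1 / 3))
    if k ** 3 != num_machines:
        raise ValueError("cluster size must be a perfect cube, e.g. 8, 27, 64")
    # Assumption: colors are numbered 1..k; the real files may use another scheme.
    return list(product(range(1, k + 1), repeat=3))


# One triplet per machine, with machines numbered from 1 (also an assumption).
for machine, triplet in enumerate(color_triplets(8), start=1):
    print(machine, *triplet)
```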

Build and install

To use the scripts, define your database connection statement inside the script, then run it with one of the following commands:

  • For the standard algorithm query:
$ python Vertica_codes/Standard_TE.py path_to_your_dataset path_to_output_directory type[directed/undirected]
  • For the randomized algorithm query:
$ python Vertica_codes/Randomized_TE.py path_to_your_dataset triplet/triplet8.txt path_to_output_directory type[directed/undirected]

In the commands above, replace type[directed/undirected] with either directed or undirected, without typing the keyword type, according to the type of the chosen dataset. Here is an example:

$ python Vertica_codes/Standard_TE.py Datasets/Real/WikiTalk.txt Results_TE/ directed

To use the graphical interface, update the file config.py in the PandaSQL directory with your database connection statement, then run the following commands:

$ cd path/to/PandaSQL_GUI
$ python server.py

To use PandaSQL, open a browser and go to: 127.0.0.1:5000

How to use

Please refer to this video for a live demonstration.

PandaSQL Demonstration

Results

The CLI version of PandaSQL writes the results of the standard algorithm in the following format: (vertex1,vertex2,vertex3)

Example of output:

vertex1 vertex2 vertex3
1 2 5
1 2 8
1 3 7
... ... ...
185 200 305
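
For intuition, rows of this shape can be produced by a three-way self-join on an edge table. The following is a minimal sketch of that general technique, not the query used in Standard_TE.py: it runs on an in-memory SQLite database instead of Vertica, and the table name `E`, the column names `i` and `j`, and the function `enumerate_triangles` are assumptions made for illustration.

```python
import sqlite3


def enumerate_triangles(edges):
    """Enumerate triangles of an undirected graph with a SQL self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER)")
    # Store each undirected edge once with i < j, dropping self-loops,
    # so that every triangle is reported exactly once.
    unique_edges = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    conn.executemany("INSERT INTO E VALUES (?, ?)", unique_edges)
    rows = conn.execute(
        """
        SELECT e1.i AS vertex1, e1.j AS vertex2, e2.j AS vertex3
        FROM E e1
        JOIN E e2 ON e1.j = e2.i                  -- path vertex1 -> vertex2 -> vertex3
        JOIN E e3 ON e3.i = e1.i AND e3.j = e2.j  -- closing edge vertex1 -> vertex3
        ORDER BY vertex1, vertex2, vertex3
        """
    ).fetchall()
    conn.close()
    return rows


for triangle in enumerate_triangles([(1, 2), (2, 5), (1, 5), (2, 8), (1, 8), (3, 7)]):
    print(*triangle)
```

Storing each edge with the smaller vertex first is one common way to avoid reporting the same triangle under several vertex orderings; a directed dataset would need a different treatment.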

The CLI version of PandaSQL writes the results of the randomized algorithm in the following format: (machine,vertex1,vertex2,vertex3)

Example of output:

machine vertex1 vertex2 vertex3
1 1 2 3
1 1 2 5
1 1 3 7
... ... ... ...
8 160 59 365
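
Because each row carries the machine that reported the triangle, the output can be used to check how evenly triangles are spread across the cluster. The sketch below counts rows per machine. It assumes whitespace- or comma-separated integer fields and skips any line that does not match (such as a header); the function name `triangles_per_machine` is hypothetical, and the actual file layout may differ.

```python
from collections import Counter


def triangles_per_machine(lines):
    """Count result rows per machine from (machine, v1, v2, v3) lines."""
    counts = Counter()
    for line in lines:
        fields = line.replace(",", " ").split()
        # Skip headers, blank lines, and anything that is not four integers.
        if len(fields) != 4 or not all(f.isdigit() for f in fields):
            continue
        counts[int(fields[0])] += 1
    return counts


sample = [
    "machine vertex1 vertex2 vertex3",
    "1 1 2 3",
    "1 1 2 5",
    "1 1 3 7",
    "8 160 59 365",
]
for machine, count in sorted(triangles_per_machine(sample).items()):
    print(machine, count)
```

To process a real result file, pass the open file object instead of the sample list, for example `triangles_per_machine(open(path))` where `path` points to a file in your output directory.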

Publication

  • Abir Farouzi, Ladjel Bellatreche, Carlos Ordonez, Gopal Pandurangan, Mimoun Malki. A Scalable Randomized Algorithm for Triangle Enumeration on Graphs based on SQL Queries. DaWaK Conference 2020.

Software license agreement

The license agreement of PandaSQL is detailed in: LICENSE

Historic Contributors (core developers first followed by alphabetical order)
