Practicum: Checkpoint 1

CSE-4/562 Spring 2019

February 8, 2019

Textbook: Ch. 16.1

Checkpoint 1

Your code compiled just the same as in Checkpoint 0.
Print a prompt '$>' at the start and after each command.
CREATE TABLE statements tell you the schema of each table.
Data lives in a '|'-separated file in 'data/[tablename].csv'
Print query results '|'-separated

CREATE TABLE PLAYERS(
ID string, 
FIRSTNAME string, 
LASTNAME string, 
FIRSTSEASON int, 
LASTSEASON int, 
WEIGHT int, 
BIRTHDATE date
);

SELECT FIRSTNAME, LASTNAME, 
   WEIGHT, BIRTHDATE 
FROM PLAYERS 
WHERE WEIGHT>200;

CREATE TABLE PLAYERS(
ID string, 
FIRSTNAME string, 
LASTNAME string, 
FIRSTSEASON int, 
LASTSEASON int, 
WEIGHT int, 
BIRTHDATE date
);

↓

There is a table named "PLAYERS"...

... with 7 attributes
... who's attributes have the given types
... with data in the file "data/PLAYERS.csv"

ABDELAL01|Alaa|Abdelnaby|1990|1994|240|1968-06-24
ABDULKA01|Kareem|Abdul-jabbar|1969|1988|225|1947-04-16
ABDULMA01|Mahmo|Abdul-rauf|1990|2000|162|1969-03-09
ABDULTA01|Tariq|Abdul-wahad|1997|2002|223|1974-11-03
ABDURSH01|Shareef|Abdur-rahim|1996|2007|225|1976-12-11
ABERNTO01|Tom|Abernethy|1976|1980|220|1954-05-06
ABRAMJO01|John|Abramovic|1946|1947|195|1919-02-09
ACKERAL01|Alex|Acker|2005|2008|185|1983-01-21
ACKERDO01|Donald|Ackerman|1953|1953|183|1930-09-04
ACRESMA01|Mark|Acres|1987|1992|220|1962-11-15
ACTONCH01|Charles|Acton|1967|1967|210|1942-01-11
...

Example Queries

SELECT A, B, ... FROM R (Project)
SELECT A, B, ... FROM R WHERE ... (Project+Filter)
SELECT A+B AS C, ... FROM R (Map)
SELECT A+B AS C, ... FROM R WHERE ... (Map+Filter)
SELECT SUM(A+B) AS C, ... FROM R (Aggregate)
SELECT SUM(A+B) AS C, ... FROM R WHERE ... (Aggregate+Filter)

    if(stmt instanceof Select){
      SelectBody bodyBase = ((Select)stmt).getSelectBody();
      if(bodyBase instanceof PlainSelect){
        PlainSelect body = (PlainSelect)bodyBase;
        ...
        body.getFromItem()
        body.getWhere()
        body.getSelectItems()
        ...
      }
    }

Iterators

There are a number of data transformations that appear in more than one pattern. For example:

Loading the CSV file in as data
Filtering rows out of data
Transforming (mapping) data into a new structure
Summarizing (aggregating) data
Printing output data

Suggestion: Abstract these steps out

Idea: Functions

loadCSV(...) returns a table
filter(condition, table) returns a table
map(expressions, table) returns a table
aggregate(aggregates, table) returns a row
print(table)

Basic Mindset


  r = get_table("R")

  s = get_table("S")
  
  temp1 = apply_join(r, s, "R.B = S.B")
  
  temp2 = apply_select(temp1, "S.C = 10")
  
  result = apply_projection(temp2, "R.A")

Select

$$\sigma_{A \neq 3} R$$

A	B
1	2
3	4
5	6

Select


                  def apply_select(input, condition)
                    result = []

                    for row in input:
                      if condition(row):
                        result += [row]

                    return result;

(All-At-Once)

Problem: A "table" can get very very big.

Better Idea: Iterators

hasNext(): Returns true if there are more rows to return
next(): Returns the next row
reset(): Resets the iterator back to the first row

All "functions" can be implemented as iterators that use constant space

Select

$$\sigma_{A \neq 3} R$$

A	B
`getNext()`		`for row in input:`
1	2	`return row;`
`getNext()`		`for row in input:`
3	4	X
5	6	`return row;`
`getNext()`		`for row in input:`
`None`		`return None;`

Example: Join (Naive)


			for r in R:
			  for s in S:
			    emit(merge(r, s))

Example: Join (Naive)


class NestedLoopJoinIterator implements RAIterator
{
  RAIterator R, S;
  Row current_r = null;
  public NestedLoopJoinIterator(RAIterator R, RAIterator S) {
    this.child = child; this.condition = condition; 
  }
  public Row next(){ 
  	if(!S.hasNext()) { S.reset(); current_r = null; }
    if(current_r == null){ current_r = R.next(); }
    return merge(S.next(), current_r);
  }
  
  /* ...? */
}

'|'-separated Value File Suggestions

Use BufferedReader's readline() method
Precompile your '|' splitter using java.util.regexp.Pattern.compile()
Parse everything upfront
Keep each row as an Array of PrimitiveValues
For Codd's sake, don't store entire tables in memory

Also, Use EvalLib's PrimitiveValue

SQL Type	PrimitiveValue
string	StringValue
varchar	StringValue
char	StringValue
int	LongValue
decimal	DoubleValue
date	DateValue

EvalLib

How do you evaluate the A > 5 in
SELECT B FROM R WHERE A > 5?

More generally, how do you evaluate a JSqlParser Expression?


    Eval eval = new Eval(){ /* we'll get what goes here shortly */ }


    // Evaluate "1 + 2.0"
    PrimitiveValue result;
    result = 
    eval.eval(
      new Addition(
        new LongPrimitive(1),
        new DoublePrimitive(2.0)
      )
    ); 
    System.out.println("Result: "+result); // "Result: 3.0"


    // Evaluate "R.A >= 5"
    result =
    eval.eval(
      new GreaterThanEquals(
        new Column(new Table(null, "R"), "A"), // `R.A`
        new LongPrimitive(5)
      )
    );

Problem: What value should EvalLib give to R.A?

Your code needs to tell it...


  Eval eval = new Eval(){
    public PrimitiveValue eval(Column c){ 
      String colName = c.getColumnName();
      String tableName = c.getTable().getName();
      PrimitiveValue ret = /* look up colName.tableName */
      return ret;
    }
  }

Building Iterators From JSQLParser Statements


      SELECT [DISTINCT]
             target
      FROM source
      WHERE cond
      ORDER BY order_by
      LIMIT lim
      UNION nextselect


  RAIterator result = new TableIterator(plainSelect.getFromItem());


  if(plainSelect.getJoins() != null){
   for(Join join : plainSelect.getJoins()){
      /** Make sure Join is a simple join **/
      result = new NestedLoopJoinIterator(result, join.getFromItem());
    }
  }


  if(plainSelect.getWhere() != null){
    result = new FilterIterator(plainSelect.getWhere(), result);
  }

Practicum: Checkpoint 1

CSE-4/562 Spring 2019

Textbook: Ch. 16.1

Checkpoint 1

Checkpoint 1

Example Queries

Iterators

Idea: Functions

Basic Mindset

Select

Select

Better Idea: Iterators

Select

Example: Join (Naive)

Example: Join (Naive)

'|'-separated Value File Suggestions

EvalLib

Building Iterators From JSQLParser Statements

Questions?