Welcome to the Rho Parabix Repository
Working on csv2json Enhancement.
Contents
User Stories
- As a data engineer, I want to specify which delimiters to use when converting a file, so that I can save data in different formats.
- As a CS student, I have no time to study or do work, so I need this program to help me finish my assignments quicker.
- As a data engineer, I want to define regex requirements for the inputs, so that I don’t have to waste precious time hunting down typos.
- As a data engineer, I deal with many different data types, such as integers, floats, and booleans. It would be very convenient if the input data were automatically converted to the matching data types in the JSON file.
- As a novice in CS, I might forget to name a header field in my data set, so I’d like the option to catch simple errors like this and fix them.
- As a data analyst, I want to format my output data in specific ways, so that it’s easier to read and analyze.
- As a data analyst, I would like a quick way to replace a data value that I entered multiple times with another value.
TDD Examples
Scenario 1
Users can use semicolons as the field delimiter.
Scenario 2
The program automatically verifies that the number of header fields matches the number of CSV fields.
Scenario 3
Users can restrict the date format to [1-2][0-9]{3}-[0-1][0-9]-[0-3][0-9] to catch some typos.
Scenario 4
Numbers in the CSV file are converted into JSON integers instead of strings.
Scenario 5
If there are more CSV fields than header fields, give the option to name the extra CSV field.
Scenario 6
The program can separate different objects into separate lines.
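A minimal Python sketch of the line-per-pair layout for Scenario 6, assuming each key/value pair is printed on its own line with no braces or commas. The function name `one_pair_per_line` is hypothetical and is not part of csv2json.

```python
import csv
import io
import json

def one_pair_per_line(text):
    """Return one string per CSV row, with each "key":"value" pair on its own line."""
    reader = csv.DictReader(io.StringIO(text))
    blocks = []
    for row in reader:
        # json.dumps adds the double quotes and escapes special characters.
        pairs = [f"{json.dumps(key)}:{json.dumps(value)}" for key, value in row.items()]
        blocks.append("\n".join(pairs))
    return blocks

sample = (
    "date,county,state,fips,cases,deaths\n"
    "2020-01-21,Snohomish,Washington,53061,1,0\n"
)
blocks = one_pair_per_line(sample)
print(blocks[0])
```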
Scenario 7
The program will find occurrences of the original value, replace each one with the new value given by the user, and return the number of times the value was replaced.
Executable Test Cases
Outputs are from the Terminal.
Scenario 1
scenario1.csv:
date;county;state;fips;cases;deaths
2020-01-21;Snohomish;Washington;53061;1;0
Command: ./csv2json scenario1.csv del ";"
Output:
{"date":"2020-01-21","county":"Snohomish","state":"Washington","fips":"53061","cases":"1","deaths":"0"}
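The delimiter behaviour in this scenario can be sketched in Python with the standard `csv` and `json` modules. This is only an illustration of the expected behaviour, not the csv2json implementation; the function name `csv_to_json` is hypothetical.

```python
import csv
import io
import json

def csv_to_json(text, delimiter=","):
    """Convert CSV text into a list of compact JSON object strings, one per data row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # separators=(",", ":") removes the spaces json.dumps inserts by default.
    return [json.dumps(row, separators=(",", ":")) for row in reader]

sample = (
    "date;county;state;fips;cases;deaths\n"
    "2020-01-21;Snohomish;Washington;53061;1;0\n"
)
lines = csv_to_json(sample, delimiter=";")
print(lines[0])
```

Every value stays a string here because no type conversion is applied.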
Scenario 2
scenario2.csv:
date,county,state,fips,cases,deaths
2020-01-21,Snohomish,Washington,53061,1,0
Command: ./csv2json scenario2.csv -h
Output:
Number of fields do not match
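One possible way to implement the field-count check, sketched in Python. The function name `mismatched_rows` and both sample inputs are hypothetical; the second sample drops a field on purpose to trigger the check.

```python
import csv
import io

def mismatched_rows(text, delimiter=","):
    """Return (line_number, field_count) for every data row whose field count differs from the header's."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    expected = len(rows[0])
    # Data rows start on line 2 because line 1 is the header.
    return [
        (line_number, len(row))
        for line_number, row in enumerate(rows[1:], start=2)
        if len(row) != expected
    ]

matching = "date,county,state\n2020-01-21,Snohomish,Washington\n"
short_row = "date,county,state\n2020-01-21,Snohomish\n"

ok = mismatched_rows(matching)
problems = mismatched_rows(short_row)
if problems:
    print("Number of fields do not match")
```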
Scenario 3
scenario3.csv:
date,county,state,fips,cases,deaths
3020-21-91,Snohomish,Washington,53061,1,0
Command: ./csv2json scenario3.csv -c date:[1-2][0-9]{3}-[0-1][0-9]-[0-3][0-9]
Output:
Invalid "date" format
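The column check can be sketched in Python with the `re` module, assuming the whole field must match the pattern. The function name `invalid_values` is hypothetical, and the second data row is added here only to show a value that passes.

```python
import csv
import io
import re

DATE_PATTERN = r"[1-2][0-9]{3}-[0-1][0-9]-[0-3][0-9]"

def invalid_values(text, column, pattern):
    """Return every value in the given column that does not fully match the pattern."""
    regex = re.compile(pattern)
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader if regex.fullmatch(row[column]) is None]

sample = (
    "date,county,state,fips,cases,deaths\n"
    "3020-21-91,Snohomish,Washington,53061,1,0\n"
    "2020-01-21,Snohomish,Washington,53061,1,0\n"
)
bad = invalid_values(sample, "date", DATE_PATTERN)
for value in bad:
    print(f'Invalid "date" format: {value}')
```

The pattern only catches some typos: a value such as 2020-19-39 still matches, because each digit is checked on its own.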
Scenario 4
scenario4.csv:
name, usernum, employed
"Bryan", "1", "true"
Command: ./csv2json scenario4.csv -f
Output:
[{"name": "Bryan", "usernum": 1, "employed": true}]
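A rough Python sketch of the type conversion, assuming that `true` and `false` become booleans, digit-only values become integers, and decimal values become floats. The names `coerce` and `convert` are hypothetical, and the sketch assumes every row has as many fields as the header.

```python
import csv
import io
import json
import re

INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+)")

def coerce(value):
    """Map a CSV string to bool, int, or float when it looks like one; otherwise keep the string."""
    if value in ("true", "false"):
        return value == "true"
    if INT_RE.fullmatch(value):
        return int(value)
    if FLOAT_RE.fullmatch(value):
        return float(value)
    return value

def convert(text):
    """Parse CSV text and return a list of dicts with typed values."""
    # skipinitialspace=True ignores the space that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{key: coerce(value) for key, value in row.items()} for row in reader]

sample = 'name, usernum, employed\n"Bryan", "1", "true"\n'
result = json.dumps(convert(sample))
print(result)
```

Automatic conversion has a cost: an identifier with leading zeros, such as a FIPS code like 01001, would lose them when read as an integer, so a real implementation would probably let users exclude columns.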
Scenario 5
scenario5.csv:
name, usernum, employed,
"Bryan", "1", "true", "Simon Fraser University"
Command: ./csv2json scenario5.csv -hf
Output:
Data field is missing header name
Please name the data field:
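The prompt flow could look roughly like the Python sketch below. The function `fill_missing_headers` is hypothetical, and a canned answer ("school", an illustrative name) stands in for what the user would type at the prompt.

```python
import csv
import io

def fill_missing_headers(header, ask=input):
    """Return the header with every blank name replaced by the answer from ask()."""
    prompt = "Data field is missing header name\nPlease name the data field: "
    return [name if name.strip() else ask(prompt).strip() for name in header]

sample = (
    'name, usernum, employed,\n'
    '"Bryan", "1", "true", "Simon Fraser University"\n'
)
rows = list(csv.reader(io.StringIO(sample), skipinitialspace=True))

# The lambda replaces the interactive prompt so the sketch runs unattended.
header = fill_missing_headers(rows[0], ask=lambda prompt: "school")
record = dict(zip(header, rows[1]))
print(record)
```

Passing `ask` as a parameter keeps the function testable; leaving it at the default `input` gives the interactive behaviour described in the scenario.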
Scenario 6
scenario6.csv:
date,county,state,fips,cases,deaths
2020-01-21,Snohomish,Washington,53061,1,0
Command: ./csv2json scenario6.csv -sl
Output:
"date":"2020-01-21"
"county":"Snohomish"
"state":"Washington"
"fips":"53061"
"cases":"1"
"deaths":"0"
Scenario 7
scenario7.csv:
date,name,favourite food,
2020-01-21,Mike,suop
Command: ./csv2json scenario7.csv -r ["suop" "soup"]
Output:
date,name,favourite food,
2020-01-21,Mike,soup
"suop" replaced with "soup" 1 time(s)
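The replace-and-count behaviour can be sketched in Python as follows, assuming only whole fields in data rows are replaced and the header is left alone. The function name `replace_value` is hypothetical.

```python
import csv
import io

def replace_value(text, old, new):
    """Replace data fields equal to old with new; return (updated_text, replacement_count)."""
    rows = list(csv.reader(io.StringIO(text)))
    count = 0
    for row in rows[1:]:  # skip the header row
        for index, field in enumerate(row):
            if field == old:
                row[index] = new
                count += 1
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue(), count

sample = "date,name,favourite food,\n2020-01-21,Mike,suop\n"
updated, count = replace_value(sample, "suop", "soup")
print(updated, end="")
print(f'"suop" replaced with "soup" {count} time(s)')
```

Matching whole fields avoids accidental edits inside longer values; a substring match would need `str.replace` and a different way of counting.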