Mini Project 20242 en Rev1.1
Mini project
Illuminance sensor simulation and data analysis
Description:
Students need to write a program in C or C++ with the appropriate functions and data structures to simulate
illuminance sensors which measure the illuminance of indoor spaces. Main sensor specifications are:
Measurement range: 0.1 to 100000 lux
Resolution: 0.01 lux
The tasks to do include:
1. Task 1:
Write a program which allows the user to provide the number of sensors, sampling time and measurement
duration through command-line arguments to generate simulation data. The format of the command line
is as follows:
C:\\lux_sim -n [num_sensors] -s [sampling] -i [interval]
Where:
lux_sim: is the name of the compiled program file.
-n [num_sensors] is a pair of input arguments giving the number of sensors;
[num_sensors] must be replaced by a specific positive integer value. If only one of the two
appears on the command line, an error message must be delivered. If neither is
given, the default number of sensors, 1 (one), is used.
-s [sampling] is a pair of input arguments giving the sampling time; [sampling] must
be replaced by a specific positive integer value in seconds, and the smallest sampling time allowed is 1
second. If only one of the two appears on the command line, an error message must be
delivered. If neither is given, the default sampling time of
60 seconds is used.
-i [interval] is a pair of input arguments giving the simulation/measurement duration;
[interval] must be replaced by a specific positive integer value in hours, and the smallest duration
allowed is 1 hour. If only one of the two appears on the command line, an error message
must be delivered. If neither is given, the default duration of
24 hours is used.
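The argument rules above can be sketched in C as follows. This is a minimal illustration rather than a required design: the sim_config structure and the parse_args and parse_positive names are hypothetical, and rejecting unknown arguments is an assumption that the description does not state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three settings.  */
typedef struct
{
  long num_sensors; /* default 1 */
  long sampling;    /* seconds, default 60 */
  long interval;    /* hours, default 24 */
} sim_config;

/* Convert TEXT to an integer >= 1; return -1 if it is not one.  */
static long
parse_positive (const char *text)
{
  char *end = NULL;
  long value = strtol (text, &end, 10);

  if (end == text || *end != '\0' || value < 1)
    return -1;
  return value;
}

/* Fill CFG from ARGV.  Return 0 on success, or -1 for a flag without a
   value, a value without a flag, an invalid value or an unknown argument
   (the last case is an assumption).  */
int
parse_args (int argc, char **argv, sim_config *cfg)
{
  assert (cfg != NULL);
  cfg->num_sensors = 1;
  cfg->sampling = 60;
  cfg->interval = 24;

  for (int i = 1; i < argc; i++)
    {
      long *target = NULL;

      if (strcmp (argv[i], "-n") == 0)
        target = &cfg->num_sensors;
      else if (strcmp (argv[i], "-s") == 0)
        target = &cfg->sampling;
      else if (strcmp (argv[i], "-i") == 0)
        target = &cfg->interval;
      else
        return -1; /* value without its flag, or unknown argument */

      if (i + 1 >= argc)
        return -1; /* flag without its value */
      *target = parse_positive (argv[++i]);
      if (*target < 0)
        return -1;
    }
  return 0;
}
```

A caller would typically print an error message and exit when parse_args returns -1, and otherwise continue with the values in the structure.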
The simulation data generated include the sensor identification number (sensor id), the (simulated) measurement
timestamp and the (simulated) sensor value. The starting time of the simulation, i.e. the first timestamp, is
obtained by subtracting the simulation duration from the current system time (the computer time when the
program is executed). The default timezone, usually the local time, should be used; students do not need to use any
function or argument to change the timezone.
The sensor id is generated in the range of 1 to num_sensors, where num_sensors is the
number of sensors that the user provided on the command line, e.g. if num_sensors =
10 the program will create 10 sensors with the ids 1, 2, 3, …, 10.
The measurement timestamp (simulated) needs to be written in the format of YYYY:MM:DD
hh:mm:ss, where:
o YYYY – year, MM – month, DD – day.
o hh – hour, mm – minute, ss – second.
E.g: 2025:04:01 08:30:02
The measurement value (simulated) is generated randomly with the precision of 2 digits after
decimal point.
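One possible way to produce such a value is sketched below. The function name random_lux is hypothetical, the use of rand() is an assumption (the description does not prescribe a generator), and the rounding trick shown is only valid for non-negative values.

```c
#include <assert.h>
#include <stdlib.h>

/* Return a pseudo-random value in [MIN_LUX, MAX_LUX], rounded to two
   digits after the decimal point.  rand () has a limited resolution
   (RAND_MAX may be as small as 32767), which is acceptable for a sketch.  */
double
random_lux (double min_lux, double max_lux)
{
  assert (min_lux >= 0.0 && min_lux <= max_lux);

  double unit = (double) rand () / (double) RAND_MAX; /* 0.0 .. 1.0 */
  double raw = min_lux + unit * (max_lux - min_lux);

  /* Round to the nearest hundredth; correct for non-negative RAW only.  */
  return (double) (long) (raw * 100.0 + 0.5) / 100.0;
}
```

The value would then be written with a "%.2f" conversion so that exactly two decimals appear in the file. Calling srand() once at program start, for instance with the current time, avoids generating the same sequence on every run.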
Notice: the simulation time (duration and timestamps) is not real time; it is only computed in the simulation. Thus,
students must not use time-delay functions such as sleep() or loops to create a time delay.
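A minimal sketch of computing the simulated timestamps by arithmetic alone, without any delay, is given below. The function names are hypothetical, and the code assumes that time_t counts seconds, which holds on Windows and POSIX systems but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* First simulated timestamp: NOW minus the duration given in hours.  */
time_t
simulation_start (time_t now, long interval_hours)
{
  return now - (time_t) (interval_hours * 3600L);
}

/* Timestamp of sample number INDEX (0-based); no waiting is involved.  */
time_t
sample_time (time_t start, long sampling_seconds, long index)
{
  return start + (time_t) (sampling_seconds * index);
}

/* Write T into BUFFER as "YYYY:MM:DD hh:mm:ss" in local time.
   Return 0 on success and -1 on failure.  */
int
format_timestamp (time_t t, char *buffer, size_t size)
{
  assert (buffer != NULL);

  struct tm *parts = localtime (&t);

  if (parts == NULL)
    return -1;
  return strftime (buffer, size, "%Y:%m:%d %H:%M:%S", parts) == 0 ? -1 : 0;
}
```

A generator loop would call simulation_start (time (NULL), interval) once and then sample_time for each index, so that a 24-hour simulation finishes immediately.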
The data generated needs to be stored in a file named “lux_sensor.csv”; if the file exists, the program can
overwrite the old file. This data file follows the CSV (comma-separated values) format, with each field separated
by a comma. The CSV file format is described at: https://www.ietf.org/rfc/rfc4180.txt. The data file
needs to be in the same folder as the program file.
id,location
1,2
2,3
3,4
4,10
5,2
…
9,5
Where the id column contains all the sensor ids available in [data_filename.csv] and location is the
location code as described in Table 1. If a sensor in the data file is not included in the location file, its location
code is 0. For example, if the data file has 9 sensors with ids from 1 to 9 but the location file has only the 7 sensor
ids 1, 2, 5, 6, 7, 8, 9 with proper location information, and sensors 3 and 4 are missing or have no location
information, the location codes of sensors 3 and 4 are 0 (unknown location).
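The fallback to location code 0 can be sketched as a simple table lookup. The sensor_location structure and the lookup_location name are hypothetical; a linear search is used only for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the location file (hypothetical in-memory form).  */
typedef struct
{
  int id;
  int location;
} sensor_location;

/* Return the location code of SENSOR_ID, or 0 (unknown location) when
   the sensor does not appear in TABLE.  */
int
lookup_location (const sensor_location *table, size_t count, int sensor_id)
{
  assert (table != NULL || count == 0);

  for (size_t i = 0; i < count; i++)
    {
      if (table[i].id == sensor_id)
        return table[i].location;
    }
  return 0;
}
```

A row with a blank location code could be stored with location 0 when the file is read, so that both cases follow the same path.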
Table 1. Type of location with related activities and the required lux level
a. Task 2.1:
It is assumed that the lux level in the monitored environment can only be in the range of 1 to 30000 lux. Thus,
valid sensor values must be within this range, and any values outside this range are outliers. The
program needs to check for invalid sensor values, i.e. outliers, in the [data_filename.csv] file and
store them in a csv file named “lux_outlier.csv”. An example of this file's content is shown below. If “lux_outlier.csv”
already exists, it should be overwritten with the new one. The first line is written as “number of outliers:
X”, where X is the number of outliers filtered out from the data file; in this example X = 3.
number of outliers: 3
id,time,value
1,2025:04:11 00:00:00,0.08
3,2025:04:11 19:03:00,-1.01
3,2025:04:11 21:06:00,45690.22
The valid data values are also stored in “lux_valid.csv” with the same format as the input data file as below.
Only the valid sensor values are used to perform the rest of the tasks from task 2.2 onwards.
id,time,value
1,2025:04:08 00:00:00,50.01
2,2025:04:08 00:00:00,24.02
3,2025:04:08 00:00:00,200.05
1,2025:04:08 00:01:00,100.12
2,2025:04:08 00:01:00,55.34
3,2025:04:08 00:01:00,160.49
…
1,2025:04:08 10:00:00,120.52
2,2025:04:08 10:00:00,90.40
3,2025:04:08 10:00:00,351.00
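The validity check of task 2.1 can be sketched as below. The macro and function names are hypothetical, and treating both limits as valid values (an inclusive range) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define LUX_VALID_MIN 1.0     /* lower limit of the valid range */
#define LUX_VALID_MAX 30000.0 /* upper limit of the valid range */

/* Return 1 when LUX lies outside the valid range, 0 otherwise.  */
int
is_outlier (double lux)
{
  return lux < LUX_VALID_MIN || lux > LUX_VALID_MAX;
}

/* Count the outliers among COUNT values; this count is the X written
   on the first line of the outlier file.  */
size_t
count_outliers (const double *values, size_t count)
{
  assert (values != NULL || count == 0);

  size_t outliers = 0;

  for (size_t i = 0; i < count; i++)
    {
      if (is_outlier (values[i]))
        outliers++;
    }
  return outliers;
}
```

Because the count must appear before the outlier rows, one option is to make a first pass that counts and a second pass that writes the rows to the two output files.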
b. Task 2.2:
The lux level can be used to identify the light condition in the monitored area as below:
If lux value < Lux_min: dark
If lux value > Lux_max: bright
If Lux_min ≤ lux value ≤ Lux_max: good
Where Lux_min and Lux_max are given in table 1 for each type of location/area.
If the location is unknown, the condition should be “NA” (not available)
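These rules translate directly into a small function. The sketch below uses the hypothetical name lux_condition and takes Lux_min and Lux_max as parameters, on the assumption that the caller has already looked them up in Table 1 for the given location.

```c
#include <assert.h>

/* Return the condition text for LUX at LOCATION.  LUX_MIN and LUX_MAX are
   the limits of that location type; they are ignored when the location is
   unknown (code 0).  */
const char *
lux_condition (int location, double lux, double lux_min, double lux_max)
{
  if (location == 0)
    return "NA";

  assert (lux_min <= lux_max);
  if (lux < lux_min)
    return "dark";
  if (lux > lux_max)
    return "bright";
  return "good";
}
```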
The program needs to calculate the average lux level per hour, e.g. the average lux level at 2025:04:11
02:00:00 is the average value of all the lux values from 2025:04:11 01:00:00 to 2025:04:11 01:59:59. It
also identifies the lux condition with respect to that average value. The results should be stored in a file
named “lux_condition.csv” with the same format as the example below:
id,time,location,value,condition
1,2025:04:11 01:00:00,1,30.51,good
2,2025:04:11 01:00:00,2,400.03,bright
3,2025:04:11 01:00:00,10,200.00,dark
4,2025:04:11 01:00:00,0,240.00,NA
1,2025:04:11 02:00:00,1,101.02,bright
2,2025:04:11 02:00:00,2,75.02,good
3,2025:04:11 02:00:00,10,812.05,good
4,2025:04:11 02:00:00,0,2.05,NA
…
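The hourly grouping described above, where the values from 01:00:00 to 01:59:59 are reported at 02:00:00, can be sketched as follows. The hour_label, bucket_add and bucket_mean names and the hour_bucket structure are hypothetical, and the label is computed in local time, which is an assumption consistent with the default timezone used in task 1.

```c
#include <assert.h>
#include <time.h>

/* Return the label of the hourly group containing T: the start of the
   next full hour in local time, or (time_t) -1 on failure.  */
time_t
hour_label (time_t t)
{
  struct tm *local = localtime (&t);

  if (local == NULL)
    return (time_t) -1;

  struct tm parts = *local;

  parts.tm_min = 0;
  parts.tm_sec = 0;
  parts.tm_hour += 1;  /* mktime normalises hour 24 to the next day */
  parts.tm_isdst = -1; /* let the library decide about daylight saving */
  return mktime (&parts);
}

/* Running sum for one sensor and one hourly group.  */
typedef struct
{
  double sum;
  long count;
} hour_bucket;

void
bucket_add (hour_bucket *bucket, double lux)
{
  assert (bucket != NULL);
  bucket->sum += lux;
  bucket->count++;
}

double
bucket_mean (const hour_bucket *bucket)
{
  assert (bucket != NULL && bucket->count > 0);
  return bucket->sum / (double) bucket->count;
}
```

Each valid row would be added to the bucket selected by its sensor id and hour_label; the mean of every bucket is then classified and written as one output row.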
c. Task 2.3:
Identify the maximum (max), minimum (min) and the average lux values over all the time (mean)
measured by each sensor. The results must be stored in a file named “lux_summary.csv” with the same
format as the below example:
id,parameter,time,value
1,max,2025:04:11 08:30:00,350.80
1,min,2025:04:11 09:31:03,5.61
1,mean,10:00:00,200.54
2,max,2025:04:11 08:35:00,300.83
2,min,2025:04:11 09:32:03,15.61
2,mean,10:00:00,110.55
3,max,2025:04:11 09:05:02,120.67
3,min,2025:04:11 09:21:03,20.81
3,mean,10:00:00,70.59
…
The times of the max and min values are the earliest timestamps at which these values appear in the input file. The
time of the mean value is the simulation time interval.
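A minimal sketch of the per-sensor summary is shown below. The sample and lux_summary structures and the summarise function are hypothetical, and the samples are assumed to be passed in the order in which they appear in the input file, so that strict comparisons keep the earliest occurrence of a repeated maximum or minimum.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One valid measurement of a single sensor.  */
typedef struct
{
  time_t time;
  double value;
} sample;

typedef struct
{
  sample max;
  sample min;
  double mean;
} lux_summary;

/* Summarise COUNT samples of one sensor given in file order.  */
lux_summary
summarise (const sample *samples, size_t count)
{
  assert (samples != NULL && count > 0);

  lux_summary result = { samples[0], samples[0], 0.0 };
  double sum = 0.0;

  for (size_t i = 0; i < count; i++)
    {
      /* Strict comparisons keep the first occurrence on ties.  */
      if (samples[i].value > result.max.value)
        result.max = samples[i];
      if (samples[i].value < result.min.value)
        result.min = samples[i];
      sum += samples[i].value;
    }
  result.mean = sum / (double) count;
  return result;
}
```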
3. Task 3:
It is assumed that the users need to send and receive the output data of task 2.2 over a communication
protocol. The data packet transferred is a byte array which must follow the structure below:
Table 2. Data packet frame
Where:
Start byte (1 byte) is the first byte in the packet and always has the value of 0xA0.
Stop byte (1 byte) is the last byte in the packet and always has the value of 0xA9.
Packet length is the size of the packet including the start byte and stop byte.
Id is the identification number of the sensor (sensor ID) and must be a positive value (>0)
Location is the location code as described in Table 1 and is a 1-byte integer.
Time is the measurement timestamp in seconds, following the Unix timestamp format.
Lux is the lux value which is a 4-byte real number represented with IEEE 754 single precision
floating-point standard.
Condition is the lux condition with respect to the lux value as identified in task 2.2 but converted
into a 1-byte integer as below:
o NA: 0
o dark: 1
o good: 2
o bright: 3
Checksum is the byte used to verify the data packet and is calculated using the two's complement algorithm
over the byte group including [packet length, id, time, location, lux]
All the numbers (integer and real ones) are represented as big-endian.
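The big-endian encoding and the two's complement checksum can be sketched with the helpers below. All function names are hypothetical, the code assumes that float is an IEEE 754 single-precision type on the target platform, and the caller decides which bytes are passed to the checksum function according to the byte group listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store VALUE in OUT[0..3], most significant byte first.  */
void
put_u32_be (uint8_t *out, uint32_t value)
{
  assert (out != NULL);
  out[0] = (uint8_t) (value >> 24);
  out[1] = (uint8_t) (value >> 16);
  out[2] = (uint8_t) (value >> 8);
  out[3] = (uint8_t) value;
}

/* Store the bit pattern of VALUE in OUT[0..3], big-endian.
   Assumes float is IEEE 754 single precision.  */
void
put_float_be (uint8_t *out, float value)
{
  uint32_t bits;

  memcpy (&bits, &value, sizeof bits);
  put_u32_be (out, bits);
}

/* Two's complement checksum: the byte which, added to the 8-bit sum of
   BYTES, gives zero.  */
uint8_t
checksum_twos_complement (const uint8_t *bytes, size_t count)
{
  assert (bytes != NULL || count == 0);

  uint8_t sum = 0;

  for (size_t i = 0; i < count; i++)
    sum = (uint8_t) (sum + bytes[i]);
  return (uint8_t) (0u - sum);
}
```

Shifting and masking, rather than copying the integer directly, keeps the byte order independent of the endianness of the machine that runs the program.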
If the input file is a text file with the extension “.dat” which contains the data packets as described in Table 2,
with each packet written on one line, the output file should be a csv file which must have the same format as
the one described in task 2.2.
For example:
C:\\lux_comm hex_packet_ee3491.dat lux_condition.csv
The program should:
Read each line of the input file, i.e.: hex_packet_ee3491.dat
Convert each data packet to proper data fields and one field is separated from the other by a comma
as the file in task 2.2, e.g.:
A line of “0E 0F 03 0A 67 F2 26 70 43 60 35 C3 01 59 FE” existing in the file named
“hex_packet_ee3491.dat” is converted into
3,2025:04:06 14:00:00,10,224.21,dark
Write each converted data in one line in the output file, i.e. lux_condition.csv
Overwrite the output file lux_condition.csv if it already exists.
Be able to process at least 10000 data points, which means that the input file hex_packet_ee3491.dat
may consist of at least 10000 lines.
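Reading one packet line can be sketched as below. The function names are hypothetical, float is assumed to be IEEE 754 single precision, and the field offsets used in the usage note are read off the example line above rather than taken from Table 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse whitespace-separated hex bytes from LINE into OUT.  Return the
   number of bytes, or -1 if a token is not a byte or OUT is too small.  */
int
parse_hex_line (const char *line, uint8_t *out, size_t capacity)
{
  assert (line != NULL && out != NULL);

  size_t used = 0;
  const char *cursor = line;

  for (;;)
    {
      char *end = NULL;
      unsigned long byte = strtoul (cursor, &end, 16);

      if (end == cursor)
        break; /* no further hex digits */
      if (byte > 0xFF || used == capacity)
        return -1;
      out[used++] = (uint8_t) byte;
      cursor = end;
    }
  /* Only trailing whitespace may remain after the last byte.  */
  while (*cursor == ' ' || *cursor == '\t' || *cursor == '\r'
         || *cursor == '\n')
    cursor++;
  return *cursor == '\0' ? (int) used : -1;
}

/* Read a big-endian 32-bit unsigned integer from IN[0..3].  */
uint32_t
get_u32_be (const uint8_t *in)
{
  return ((uint32_t) in[0] << 24) | ((uint32_t) in[1] << 16)
         | ((uint32_t) in[2] << 8) | (uint32_t) in[3];
}

/* Read a big-endian IEEE 754 single-precision value from IN[0..3].  */
float
get_float_be (const uint8_t *in)
{
  uint32_t bits = get_u32_be (in);
  float value;

  memcpy (&value, &bits, sizeof value);
  return value;
}
```

Applied to the example line, parse_hex_line returns 15 bytes; byte 2 is the id 3, byte 3 is the location 10, get_u32_be on bytes 4 to 7 gives the Unix time 1743922800, get_float_be on bytes 8 to 11 gives about 224.21, and byte 12 is the condition code 1 (dark). A complete decoder would also verify the start byte, stop byte, packet length and checksum before writing the CSV row.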
location file has only 7 sensor ids 1, 2, 5, 6, 7, 8, 9 or location code in location.csv is blank,
e.g.: “3,”. The error message can be “Error 05: unknown location of sensor ID” where ID is the id of
the sensor which is not present in the location file.
If the input file contains duplicated data, i.e. two or more lines in the file are exactly the same or have
the same id and time, the error message must include the line numbers in the input file in which the
error happens, i.e.: “Error 06: data at line X and Y are duplicated” where X is the line number where
the data first appears and Y is the line number where data is the same as X. The program should
process the line X as normal and ignore line Y.
The output file exists and is read-only. The error message can be: “Error 07: cannot override
output file”.
If the input file has more than 10000 data lines, the students can choose one of the two approaches:
o Process all the data
o Process only the first 10000 data lines and ignore the rest. In that case, the program should
write the error message “Error 08: input file is too large” in the log file.
4. Other errors:
The students can suggest more errors which may happen but are not listed above. Those errors should be stored
in the respective log file and described in the report.
Program design:
The students must use a top-down approach to design the program. It is required to draw top-down diagrams
to illustrate the relationship between the functions in the program for each task, together with a brief description
in the report of why the project is organized in that way.
The students need to draw at least one flowchart for one important function in each task. The students can draw
more than one flowchart for different functions in each task.
The students must provide the folder structure and file structure as follows, with a brief description:
groupID_mini_project_20242\
task_1\
files of task 1
task_2\
files of task 2
task_3\
files of task 3
report.docx
Where:
The project root folder is named “groupID_mini_project_20242” and has three subfolders “task_1”,
“task_2” and “task_3”, which contain the source code and header files and do not include compiled
files, generated data files and other temporary files. For example, the files of task 1 are all the source
code and header files of task 1.
Do NOT use any sub-folders other than the three above.
The source code files must be named with the extension of .c or .cpp, do not use other extensions
like .cxx. The header files must be named with the extension of .h, do NOT use other extensions like
.hpp.
“report.docx” is the report file and should be placed in the project root folder.
The groupID in the folder name should be replaced by the group ID.
Coding styles:
Coding style needs to be consistent throughout the program and follows the GNU style described in the
following link: https://www.gnu.org/prep/standards/html_node/Writing-C.html
In short, it should be as follows:
The structure of the code should be clean and easy to read by using proper indentation, parenthesis,
code block, line break and spacing.
Comments are provided to explain the program more clearly but not to paraphrase the code statement.
The names of functions and variables must be in English, compact and self-described.
Hard-coding should be avoided.
Notice: The students must NOT use any third-party libraries other than the C/C++ standard library. The
algorithm.h and linalg.h in the standard library are NOT allowed either.
The tools:
Editor: Visual studio code (https://code.visualstudio.com/download)
Compiler: gcc or g++ in MinGW-w64 which can be downloaded here:
https://github.com/niXman/mingw-builds-binaries/releases/download/14.2.0-rt_v12-rev2/x86_64-14.2.0-release-posix-seh-ucrt-rt_v12-rev2.7z
If the students write the programs in Linux or MacOS, they must use the equivalent version of C/C++ and make
sure that their programs can be compiled in Windows.
AI tools like ChatGPT or Copilot are NOT allowed.
Report and submission guidelines:
The students do the mini project in a group.
The whole project needs to be organized in a folder as described in section III.
The students must write an English report in a Word file named “report.docx” which should not
exceed 6 A4 pages and should not include the source code. The report must follow the IEEE template
which is attached in the Team Assignment.
The content of the report must be:
o Introduction: Brief description of the program design idea including
The diagram of the folder structure and the source code files (if there are multiple source
code files in each task).
the standard libraries used in each task,
and the top-down approach diagrams.
Do NOT rewrite this mini project description in the report.
o Detailed design:
introduction of the crucial key data structures defined by students to handle the data (if
any),
design description of a few important functions (select some key functions; there is no
need to list every one) including the function call syntax (function name, argument list
and return value) and inputs/outputs, pre-conditions, post-conditions;
at least one flowchart of the most important function in each task as mentioned in section
III.
o Results and evaluations: summarising the results of the program execution and its
performance.
o Conclusions:
briefly conclude what has been done and what has NOT been done;
provide a table of group member contribution as the below example:
Student names    Tasks          Percentage of contribution
Nguyen Van A     1, 2.1, 2.2    50%
Nguyen Van B     2.3, 2.4, 3    50%
If only one student completes all the work and the other does nothing, one can write the
percentage of contribution as 100% and 0% respectively.
o References (If any).
The students must compress the whole project folder in a zip file and name it as
“groupId_mini_project_20242.zip” for submission. Please take note that it must be a ZIP file but
NOT any other compression files like .rar.
o Keep only the source codes (.c/.cpp), header (.h) and the Word (.docx) report files. Do not
submit the .csv, .dat, .exe files or the report file in .pdf.
o Unrelated files should be removed before submitting.
o The “groupID” in the file name must be replaced by the group ID, e.g.:
“20221234_20222345_mini_project_20242.zip”
Students must submit the zip file above in Team Assignment by the deadline specified there. Do
NOT submit via email, Teams chat or any other channels. Only ONE submission per group is
required.
If there are concerns on anything else which is NOT mentioned in this project instruction, the students
can contact the lecturer for clarification. It is highly recommended to ask the questions in class Teams
rather than private discussion with the lecturer so that every student in the class is informed unless
the question is too personal.
Evaluations:
The mini project is evaluated as below:
o All the tasks are completed, no run-time error, proper error handling and creative
implementation (60%)
o Good coding style (20%)
o Clear and well-structured report properly following the template and highly consistent with the
source code which can show a clean and reusable design (40%)
o Improper naming and structure of submission files and folder as specified in Section III (-0.5
bonus point if it is given).
The students must do the project themselves. Do NOT copy others’ work. The group should keep
their work confidential. If any two or more groups have similar source code and/or
reports, all the work of those groups is unacceptable and those groups are considered NOT to
have submitted the project. It does not matter who copies from whom.
A student with too little contribution to the group work (<20%) is also considered as NOT
submitting the project. The student in the group with the higher contribution may get a better bonus
point.
No late submission is allowed. Students should try to submit whatever they have completed in time,
even though they may not have completed all the requirements in this mini project.
Good performance in the mini project can be awarded bonus points in course progress evaluation.
Maximum bonus point may be given only if the students can complete all three tasks which can pass
a certain number of test cases and have a proper report.
Oral questions may be asked to verify the students’ work for giving bonus points.
The group who does not submit the mini project will lose 3 points (out of 10) in the course
progress evaluation.
------------------------ END ------------------------------