Commit Graph

310 Commits

Author SHA1 Message Date
GassiGiuseppe
0970cabf92 reverse.csv: corrected the header
The header also appeared to be misplaced in the middle of the CSV
2025-09-22 13:47:20 +02:00
GassiGiuseppe
a26d92750f Update movie-pageid.csv: corrected the header 2025-09-22 12:59:35 +02:00
GassiGiuseppe
34c4782232 Dataset.db update. It seems to be correct 2025-09-20 23:33:56 +02:00
GassiGiuseppe
c5439533e6 DataRetrivial update, without df 2025-09-20 23:32:08 +02:00
GassiGiuseppe
8819b8e87f DataRetrivial populate the db from csv 2025-09-20 19:56:24 +02:00
Christian Risi
1076dc8aa6 Run /Scripts/DataCleaning/SQL_Queries/db_creation.sql 2025-09-20 16:39:16 +02:00
Christian Risi
3d15e03b09 Renamed file to fix spelling 2025-09-20 16:38:38 +02:00
Christian Risi
0ee2ec6fcd Spelling corrections 2025-09-20 16:37:57 +02:00
Christian Risi
95cfa5486c Added instructions to create database schema 2025-09-20 16:30:08 +02:00
GassiGiuseppe
0d30e90ee0 Created file for the db DatawareHouse
Also decided on the first schema models in DBMerger
2025-09-20 15:53:32 +02:00
GassiGiuseppe
faaba17a98 Merge branch 'dev' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev 2025-09-20 14:34:25 +02:00
Christian Risi
854e5f1d98 Updated file to gather data from wikipedia 2025-09-20 14:32:30 +02:00
GassiGiuseppe
242d7f674f Wikipedia summary file uploaded
Dataset composed of PageId and Wikipedia summary
2025-09-20 14:32:25 +02:00
Christian Risi
de8c2afceb Added reconciliation 2025-09-19 22:22:09 +02:00
Christian Risi
f89dffff75 Created script to gather wikipedia abstracts 2025-09-19 19:01:38 +02:00
GassiGiuseppe
e39bad8348 Added Troubleshooting section to README
which addresses some potential issues with Git and large files
2025-09-19 13:39:56 +02:00
GassiGiuseppe
7a1a221017 Updated the movie-pageid database
which has the film URI as subject and the wiki page ID as object
2025-09-19 13:37:56 +02:00
Christian Risi
fafe6ae0f9 Modified tree structure with more TMP directories 2025-09-19 12:46:31 +02:00
Christian Risi
e32444df75 Updated fetchdata to be used in terminal
Changes:
  - now you can use it as if it were a CLI command

Missing:
  - documentation
2025-09-19 12:35:15 +02:00
Christian Risi
b74b7ac4f0 Added new directories to make experiments and updated .gitignore
Changes:
  - Added /Scripts/Experiments/Queries to keep track
      of important queries, once set
  - Added /Scripts/Experiments/Tmp to run quick experiments
      when still unsure while exploring datasets
2025-09-19 08:43:54 +02:00
Christian Risi
22134391d9 Added Scripts/Experiment directory
This directory holds files for running experiments
2025-09-19 08:41:46 +02:00
Christian Risi
82c9023849 Ignoring Scripts/Experiments files and always tracking .gitkeep files 2025-09-19 08:39:47 +02:00
Christian Risi
00b87e01ea Moved fetchdata.py to reflect working tree
old - ${Proj}/Scripts/fetchdata.py
new - ${Proj}/Scripts/DataGathering/fetchdata.py
2025-09-19 08:37:04 +02:00
Christian Risi
ce3d4bf6c5 Renamed dir from Script to Scripts 2025-09-19 08:31:00 +02:00
GassiGiuseppe
c415b175a0 Added reverse.csv with the relations incoming to films 2025-09-18 20:26:51 +02:00
GassiGiuseppe
ec81ea7930 Added file to gather wikipedia abstract from url 2025-09-18 20:26:11 +02:00
GassiGiuseppe
4bb03f86b3 Added file to study the most frequent relationships in a CSV of triples 2025-09-18 20:25:25 +02:00
GassiGiuseppe
e5f201f3db DEVELOPMENT markdown file created 2025-09-18 20:24:54 +02:00
GassiGiuseppe
1c715dc569 Typo correction in the markdown 2025-09-18 20:24:11 +02:00
GassiGiuseppe
6686b47328 Added SQL to obtain wikipedia url with movies 2025-09-18 20:23:10 +02:00
GassiGiuseppe
9a5a7d84fd Merge branch 'dev' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev 2025-09-18 19:20:26 +02:00
GassiGiuseppe
9678ece9c0 Requirements changed
Added pandas and some other packages
2025-09-18 19:07:38 +02:00
Christian Risi
67bcd732b5 Updated movies 2025-09-18 18:36:52 +02:00
Christian Risi
1a4f900500 Updated git attributes 2025-09-18 18:36:42 +02:00
Christian Risi
ca8729b67c Merge branch 'dev' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev 2025-09-18 18:36:30 +02:00
GassiGiuseppe
9dbffc52ed Added dataset of movies and their wikipedia's page link 2025-09-18 18:16:51 +02:00
Christian Risi
b7f504942a Created Dataset 2025-09-18 17:24:08 +02:00
Christian Risi
7f0c5ce8d3 Updated File for fetching 2025-09-18 17:23:56 +02:00
Christian Risi
9838e287a4 Updated file 2025-09-18 12:03:09 +02:00
Christian Risi
ca6143ea3c Updated Query histories 2025-09-18 11:46:32 +02:00
Christian Risi
16e7ab4d9f Modified Datasets 2025-09-17 17:30:51 +02:00
Christian Risi
28723ab662 Merge branch 'dev' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev 2025-09-17 17:06:16 +02:00
Christian Risi
3e59efcf33 Generated datasets 2025-09-17 17:06:14 +02:00
Christian Risi
7c04309cc1 Added script to fetch data from DBpedia 2025-09-17 17:05:27 +02:00
Christian Risi
db87295890 Added history of queries 2025-09-17 17:04:58 +02:00
GassiGiuseppe
61568200a8 README update with setup chapter
which lists the commands to manage conda and pip
2025-09-17 16:50:50 +02:00
Christian Risi
8df2736b97 Added environments 2025-09-17 16:16:58 +02:00
Christian Risi
eb5b7f629a Conda env 2025-09-17 15:53:17 +02:00
Christian Risi
79232b391e First SPARQL query 2025-09-17 14:26:37 +02:00
Christian Risi
72eb937b47 Fixed Markdown violations 2025-09-17 12:51:14 +02:00