334 Commits

Author SHA1 Message Date
Christian Risi
892f91aad7 Fixes for evaluation 2025-10-16 19:20:23 +02:00
Christian Risi
9ff117f437 Adding best model 2025-10-16 19:20:09 +02:00
Christian Risi
83693f1d4e Fixed a patience bug and added label smoothing 2025-10-14 11:03:15 +02:00
Christian Risi
0b256001fe Changed position to reflect other datasets 2025-10-14 10:51:11 +02:00
Christian Risi
7585f556f8 Added more logging 2025-10-14 10:41:28 +02:00
Christian Risi
4968d79403 Fixed a masking problem 2025-10-14 10:34:14 +02:00
GassiGiuseppe
80fd7fd600 evaluator WIP 2025-10-12 22:59:07 +02:00
GassiGiuseppe
972a73758d added holdout for curated dataset 2025-10-12 19:06:09 +02:00
GassiGiuseppe
b38a011105 added curated dataset, which is 8000 2025-10-12 19:01:28 +02:00
GassiGiuseppe
2bdcd78622 Merge branch 'dev.train' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev.train 2025-10-12 18:20:04 +02:00
GassiGiuseppe
7dedbc481b evaluator WIP 2025-10-12 18:18:20 +02:00
Christian Risi
76345f8d4f Fixed a visual bug 2025-10-12 16:42:59 +02:00
GassiGiuseppe
2ccec9efb8 typo 2025-10-12 16:41:06 +02:00
GassiGiuseppe
e2231eb3b9 Merge branch 'dev.train' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev.train 2025-10-12 16:36:09 +02:00
GassiGiuseppe
144f8724d6 Update of the batcher to resolve a bug in the 4th construction 2025-10-12 16:35:42 +02:00
Christian Risi
07130ff489 Fixed several bugs for task 4 2025-10-12 16:30:30 +02:00
Christian Risi
e0f8a36aa5 Added support for fast resuming 2025-10-12 13:53:07 +02:00
Christian Risi
37a2501a79 Added a way to load checkpoints 2025-10-12 12:28:24 +02:00
Christian Risi
4ca1d0a189 Activated Dropout to avoid overfitting 2025-10-12 12:28:06 +02:00
Christian Risi
f463f699cf Fixed a bug over task 4 2025-10-12 12:22:38 +02:00
Christian Risi
ab3d68bc13 fixed patience not quitting 2025-10-12 01:41:34 +02:00
Christian Risi
79438e3d30 Fixed Patience system 2025-10-12 01:22:06 +02:00
Christian Risi
f98f5a2611 Fixed misprint in task 3 2025-10-12 01:16:09 +02:00
Christian Risi
4281f8724b Fixed Validation loss 2025-10-12 00:57:24 +02:00
Christian Risi
71d602e36e Fixed a memory bug 2025-10-12 00:47:20 +02:00
Christian Risi
46ee6055ec Added Colab default values 2025-10-12 00:15:54 +02:00
Christian Risi
e579e1c88b fixed verbosity 2025-10-12 00:15:15 +02:00
Christian Risi
f51ada866f Added verbosity level 2025-10-12 00:13:03 +02:00
Christian Risi
acd978cbc5 Merge branch 'dev.train' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev.train 2025-10-12 00:05:36 +02:00
Christian Risi
56fbadd55e Fixed training 2025-10-12 00:05:30 +02:00
GassiGiuseppe
14f1c574e7 typo batch size 2025-10-11 22:11:53 +02:00
Christian Risi
d8e65bfb8a Fixed a bug about mismatched batch sizes 2025-10-11 22:09:46 +02:00
Christian Risi
bcc2fe7368 Fixed bugs and added visibility 2025-10-11 21:49:29 +02:00
Christian Risi
160b7dbfc0 V0.0.1 Athene 2025-10-11 19:35:43 +02:00
GassiGiuseppe
49946727d8 updated decoder_input to work without embedder 2025-10-11 16:53:36 +02:00
GassiGiuseppe
1649cd7768 added decoder_input method to build the batch tensor given as input to the decoder 2025-10-11 16:18:43 +02:00
GassiGiuseppe
443f54fffd WIP decoder with prefix mask 2025-10-11 15:31:43 +02:00
GassiGiuseppe
ff721107b9 typo 2025-10-11 15:26:58 +02:00
GassiGiuseppe
f1886e5be1 added builder for prefix mask 2025-10-11 15:19:09 +02:00
GassiGiuseppe
5e3878ea17 Merge branch 'dev' into dev.train 2025-10-11 11:51:58 +02:00
GassiGiuseppe
79d3fb9ff8 Merge branch 'dev' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev 2025-10-11 11:51:19 +02:00
GassiGiuseppe
586f021276 Merge branch 'dev.train' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev.train 2025-10-11 11:28:35 +02:00
GassiGiuseppe
82462078f8 WIP for the new prefix mask 2025-10-11 11:28:15 +02:00
Christian Risi
625f79f7c3 Fixed imports 2025-10-11 11:18:44 +02:00
GassiGiuseppe
3446870291 typo 2025-10-10 22:27:01 +02:00
GassiGiuseppe
e76dbeb9a7 typo 2025-10-10 22:26:06 +02:00
GassiGiuseppe
96610612fe Batcher added 2025-10-10 20:10:08 +02:00
Christian Risi
92ae40013d Added a way to detach models and create them standalone 2025-10-10 18:43:20 +02:00
Christian Risi
15f203cad5 Added boe 16k tokens vocabulary 2025-10-10 18:43:02 +02:00
Christian Risi
31c8541dfb Co-authored-by: GassiGiuseppe <GassiGiuseppe@users.noreply.github.com> 2025-10-10 16:28:09 +02:00