Commit 8f193b26 authored by Pavel Veselý

Merge branch 'master' of gitlab.kam.mff.cuni.cz:datovky/assignments

parents b75af5d6 d68e8596
@@ -15,7 +15,7 @@ tests works only on Linux (and not on Windows), and of course also in ReCodEx.
 You can use full standard library of Python and C++ in this assignment,
 including data structure implementations (also, `bytearray` might come handy).
 Your solution must also work on other input data of the same size with similar
-number of duplicates. Hence solutions depending on the fact that each string is
+number of duplicates. Hence, solutions depending on the fact that each string is
 uniquely determined by some its substring or similar properties of the input
 will not be accepted.
@@ -25,3 +25,11 @@ Note that due to the space constraints of the Python solutions, tests `10M` and
 not used and are always considered successful by ReCodEx.
 Source code templates can be found in [git](https://gitlab.kam.mff.cuni.cz/datovky/assignments/-/tree/master).
+Hints:
+* Array [ False ] * 2**20 requires approximately 8 MB since Python stores it as an array of pointers to one value False. Use bytearray instead.
+* Read carefully the documentation of bytearray and distinguish the terms bit and byte.
+* In Python, do not import numpy or other libraries consuming more memory to load than available.
+* The memory limit prevents storing all keys, so trying trivial solutions which store all keys in a dictionary is a waste of time.
+* Count the number of duplicates and candidates for duplicates. For properly implemented hashing, those two numbers should be very close.
+* Use profilers to trace memory usage; see e.g. https://docs.python.org/3/library/tracemalloc.html or https://valgrind.org/.
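The hints about `bytearray`, bits versus bytes, and `tracemalloc` can be illustrated with a minimal sketch. The class `BitArray` and its methods `set` and `get` are hypothetical names chosen for this example; they do not come from the assignment templates, and the sketch is not a solution to the assignment. It assumes 64-bit CPython for the size quoted in the comments.

```python
import sys
import tracemalloc


class BitArray:
    """Fixed-size array of bits, packed eight per byte into a bytearray.

    Hypothetical helper for illustration only.
    """

    def __init__(self, n_bits):
        self.n_bits = n_bits
        # Round up to whole bytes; bytearray(k) creates k zero bytes.
        self.data = bytearray((n_bits + 7) // 8)

    def set(self, i):
        # i >> 3 selects the byte, i & 7 selects the bit inside that byte.
        self.data[i >> 3] |= 1 << (i & 7)

    def get(self, i):
        return (self.data[i >> 3] >> (i & 7)) & 1 == 1


n = 2**20

# One pointer per element: roughly 8 MB on 64-bit CPython.
as_list = [False] * n
# One byte per element: roughly 1 MB.
as_bytes = bytearray(n)
# One bit per element: roughly 128 KB.
as_bits = BitArray(n)

print("list:      ", sys.getsizeof(as_list), "bytes")
print("bytearray: ", sys.getsizeof(as_bytes), "bytes")
print("bit array: ", sys.getsizeof(as_bits.data), "bytes")

# tracemalloc reports memory allocated by Python while tracing is active.
tracemalloc.start()
traced = BitArray(n)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print("traced peak:", peak, "bytes")
```

Note that `sys.getsizeof` reports only the container itself, which is sufficient here because the list holds `n` references to the single object `False`.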