# Scripting exercises - sorting files

Damn it, I made a mess of my files!
Can you sort the following picture collections for me, please?
I don't really care what ordering system you use, but thousands of files in one directory is **not** practical.
There are four different folders with files to sort, and each one is a **separate** assignment, so at the end I expect **four** folders with **sorted** files (and subdirectories).

The files are archived and can be downloaded [here](./assets/files.tar).
I advise you to keep a copy of the archive so you can run your scripts multiple times until you achieve a desirable outcome.

## The batches

### Simple filenames

The first batch of files has a very straightforward filename format, for example `FUJI_20120103_171310.jpg`.
The pictures span multiple years but are all from a single device (FUJI).
Sort however you want, but `$YEAR/$MONTH` might be a good start.
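
To get you started, here is a minimal sketch of how a filename like the one above could be turned into a `$YEAR/$MONTH` directory. It assumes the middle part of the name is a date in `YYYYMMDD` order and the last part a time in `HHMMSS` order, which you should verify against the real files. The function name `target_dir` is made up for this example.

```python
from datetime import datetime
from pathlib import Path


def target_dir(filename: str) -> Path:
    """Return the relative $YEAR/$MONTH directory for one filename."""
    # "FUJI_20120103_171310.jpg" -> "FUJI_20120103_171310"
    stem = Path(filename).stem
    # Assumption: exactly three parts separated by underscores.
    _camera, date_part, time_part = stem.split("_")
    # Assumption: the date is YYYYMMDD and the time is HHMMSS.
    taken = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return Path(f"{taken:%Y}") / f"{taken:%m}"
```

On Linux, `target_dir("FUJI_20120103_171310.jpg")` gives the relative path `2012/01`. This only computes where a file should go, it does not move anything yet.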

### Multiple cameras and formats

The second batch has pictures from multiple cameras as well as multiple file extensions.
You can sort in multiple ways, for example `$YEAR/$CAMERA/$MONTH` or `$CAMERA/$YEAR/$MONTH`.
The choice is yours.
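
One way to keep the layout flexible is to split each filename into its parts first and only then build the path. The sketch below assumes the second batch uses the same `CAMERA_YYYYMMDD_HHMMSS.ext` pattern as the first one, which is a guess on my side, so check the actual files. The names `describe`, `target_dir` and the `layout` parameter are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def describe(filename: str) -> dict:
    """Split a filename into camera, capture time and extension."""
    path = Path(filename)
    # Assumption: names look like CAMERA_YYYYMMDD_HHMMSS.ext
    camera, date_part, time_part = path.stem.split("_")
    taken = datetime.strptime(f"{date_part}_{time_part}", "%Y%m%d_%H%M%S")
    return {"camera": camera, "taken": taken, "extension": path.suffix.lower()}


def target_dir(filename: str, layout: str = "{camera}/{year}/{month}") -> Path:
    """Build the target directory from a layout such as '{year}/{camera}/{month}'."""
    info = describe(filename)
    return Path(
        layout.format(
            camera=info["camera"],
            year=f"{info['taken']:%Y}",
            month=f"{info['taken']:%m}",
        )
    )
```

Switching from `$CAMERA/$YEAR/$MONTH` to `$YEAR/$CAMERA/$MONTH` is then just a matter of passing `layout="{year}/{camera}/{month}"`.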

### Messy filenames

The third batch is very messy: it has not only multiple cameras and formats but also multiple date structures.
This one will require some hefty debugging!
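
A common approach for mixed date structures is to try a list of formats one after the other and keep the first one that parses. The formats in `CANDIDATE_FORMATS` below are purely hypothetical placeholders, since I am not telling you which ones are in the batch. Replace them with what you actually find in the filenames.

```python
from datetime import datetime

# Hypothetical formats: inspect the real filenames and adjust this list.
CANDIDATE_FORMATS = [
    "%Y%m%d_%H%M%S",
    "%Y-%m-%d_%H-%M-%S",
    "%d-%m-%Y_%H%M%S",
]


def parse_timestamp(text: str):
    """Return the first successful parse of text, or None if no format fits."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # This format did not fit, try the next one.
            continue
    return None
```

Printing every name for which `parse_timestamp` returns `None` is a quick way to discover formats you have not covered yet. Keep in mind that the order of the list matters when two formats could both match the same text.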

### Recovery files

The fourth batch is pretty messed up.
No dates or logic can be found in the filenames, but luckily JPG files can contain **metadata** about the files.
This challenge will require you to search for and install extra Python3 libraries to access this metadata.
Installing will be done via `pip3`, which comes with PyCharm.
Have a look at this [library](https://github.com/TNThieding/exif) and figure out how to use it.
It might be handy to install `imagemagick` via `sudo apt install imagemagick`.
This gives you the ability to inspect metadata on the Linux command line via `identify -verbose BJtpWU7n7WCeOL2B84Vz.jpg`.
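
As a rough sketch of where this could go: EXIF stores capture times as text in the form `YYYY:MM:DD HH:MM:SS`, so once you have that string it can be parsed with `strptime`. The `read_taken_date` function reflects my reading of the linked library's README (`Image`, `has_exif`, `get("datetime_original")`), so treat those calls as assumptions and check them against its documentation. Both function names are invented for this example.

```python
from datetime import datetime
from pathlib import Path

# EXIF timestamps look like "2012:01:03 17:13:10".
EXIF_DATE_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_datetime(value: str) -> datetime:
    """Convert an EXIF timestamp string into a datetime object."""
    return datetime.strptime(value, EXIF_DATE_FORMAT)


def read_taken_date(path: Path):
    """Return the capture time stored in a JPG, or None when it is missing."""
    # Third-party library, install it first with: pip3 install exif
    from exif import Image

    with open(path, "rb") as handle:
        image = Image(handle)
    if not image.has_exif:
        return None
    # Assumption: the tag is exposed under the name "datetime_original".
    raw = image.get("datetime_original")
    return parse_exif_datetime(raw) if raw else None
```

Not every file is guaranteed to carry this tag, which is why the function returns `None` instead of crashing. You will have to decide where such files end up.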

## Some hints and tips

While we *just* discovered the creation of our own objects in Python3, you don't *need* your own classes to complete these exercises.
I would advise you to create multiple **functions** and use a lot of `print()` calls to help you make sense of your `for file in files:` loops.
You can slow down the loops with `time.sleep(1)` if the cycling feels too quick to you.
You can make one script for each batch, or reuse the same script but create different functions for the four batches, whatever is easiest for you.
Ask your classmates for **help** when you're stuck.
By explaining your problem to someone else you often come up with a solution.

Now some links and phrases to google:

* to **manipulate dates** in Python3, your best bet is the [datetime](https://docs.python.org/3/library/datetime.html) library
* definitely have a look at the [strptime](https://stackabuse.com/converting-strings-to-datetime-in-python) **method** of a datetime object
* paths can be manipulated easily with **two different** [libraries](https://www.reddit.com/r/Python/comments/l45ojr/ospath_vs_pathlib/), os **or** pathlib
* files can be **moved** with multiple [libraries](https://stackoverflow.com/questions/8858008/how-to-move-a-file) as well
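
To tie the hints together, here is one possible shape for the actual move, using `pathlib` for the paths and `shutil.move` for the file itself. This is just a sketch and not the only way to do it. The function name `move_file` and its `dry_run` switch are my own invention, added so you can print what *would* happen before touching any files.

```python
import shutil
from pathlib import Path


def move_file(source: Path, target_dir: Path, dry_run: bool = True) -> Path:
    """Move source into target_dir, creating the directory when needed."""
    destination = target_dir / source.name
    # Always show what is going on, this helps a lot while debugging.
    print(f"{source} -> {destination}")
    if not dry_run:
        # parents=True creates $YEAR and $MONTH in one go,
        # exist_ok=True avoids an error when the directory is already there.
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(destination))
    return destination
```

Run your loop with the default `dry_run=True` first and only switch to `dry_run=False` once the printed paths look right.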