# The FTP challenge

## Getting the file

The first step to solving this exercise is to download the CSV file to your Raspberry Pi.
For those wondering what on earth a CSV file is, I invite you to a detailed [read](https://en.wikipedia.org/wiki/Comma-separated_values), but to make a long story short it stands for *comma-separated values* and is one of the most basic ways to structure data.
You can use LibreOffice Calc to open it and you'll quickly understand how it works.

Now, to get the file onto our Raspberry Pi we need to download it from the webserver.
One way would be to use the `wget` program to do so.
Can you think of some alternative ways?
We know *where* the server is because we know its IP address, plus we also know the *filename*.
Putting these two together we can construct the following line.

```bash
wget 172.30.6.96/accounts.csv
```

This will download the file to the directory we're in and will save it as `accounts.csv`.
You can change the output filename if you want, just have a look at the `wget` options via `wget --help` or our trusty `man wget`.
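
As one possible answer to the two open points above, here is a minimal sketch that downloads to a filename of our choosing, using `wget -O` and falling back to `curl -o` as an alternative program. The `download` helper is made up for illustration, and the sketch assumes at least one of the two programs is installed.

```bash
# Hypothetical helper: download the URL in $1 and save it under the name in $2.
# It uses wget when available and falls back to curl otherwise.
download() {
    if command -v wget > /dev/null; then
        wget -O "$2" "$1"
    else
        curl -o "$2" "$1"
    fi
}

# Same server and filename as in the wget line above.
URL="http://172.30.6.96/accounts.csv"
```

Calling `download "$URL" students.csv` would then save the file as `students.csv` (an arbitrary example name) instead of `accounts.csv`.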
## Extracting the data we need

A quick `cat accounts.csv` gives us the following output:

```bash
EMAIL,LASTNAME,FIRSTNAME,MATRIX,GITEA,TEAM
1h.lust.hugo@gmail.com,Lust,Hugo,@hugo_lust:86thumbs.net,https://gitea.86thumbs.net/Hugo,red
ticus@kraland.net,krstev,vladimir,@vl4dd:86thumbs.net,https://gitea.86thumbs.net/vl4dd,blue
adamd@outlook.be,Adam,David,@adamd73:matrix.org,https://gitea.86thumbs.net/adamd,red
nicohawai@gmail.com,Perez,Nicolas,@hawai:86thumbs.net,https://gitea.86thumbs.net/Hawai,blue
nicolas.wattripont@gmail.com,Wattripont,Nicolas,@wawa142:86thumbs.net,https://gitea.86thumbs.net/wawa142,red
laurentdelvigne@hotmail.com,Delvigne,Laurent,@ldelvigne:86thumbs.net,https://gitea.86thumbs.net/ldelvigne,blue
sselcukaslan@gmail.com,Aslan,Selçuk,@slck:86thumbs.net,https://gitea.86thumbs.net/selcuk,blue
Sarah24886@hotmail.com,Rmiki,Sarah,@sarahrm95:matrix.org,https://gitea.86thumbs.net/sarahrm95,blue
knoppixs@hotmail.com,Abbamoulay,Abdellah,@knoppixs:86thumbs.net,https://gitea.86thumbs.net/Abdellah,red
JonathanDechief@hotmail.com,Dechief,Jonathan,@elewene:matrix.org,https://gitea.86thumbs.net/Elewene,red
51207@etu.he2b.be,,Aliou,@aliou:86thumbs.net,https://gitea.86thumbs.net/aliou,blue
```

The last line is a rather interesting one that illustrates how CSV files work.
Notice that *Aliou* does not have a LASTNAME, so there are just two consecutive `,` to mark the empty field.

### The $USERNAME

To extract the `$USERNAME` we're interested in, we need the fourth column, which has the MATRIX login handle.
We can extract only this column by using the `cut` program with `,` as a delimiter.

```bash
cat accounts.csv | cut -d "," -f 4
```

This leaves us with only the MATRIX login handles, which is a good start, but there is still a bit too much information.
We need to drop the first line, which is the *header* of the CSV file, plus crop between the `@` and the `:`.
These two operations can be done in multiple ways but I suggest these additional pipes.
It is not the most elegant solution but it uses only tools you have used so far.
The `tail` command drops the *header* line, and the two `cut` commands trim the username to just what we need.
Can you think of a better way to do this?
Remember `tr` from your [bandit](https://www.overthewire.org) days?

```bash
cat accounts.csv | cut -d "," -f 4 | tail -n +2 | cut -d ":" -f 1 | cut -d "@" -f 2
```

Done!
Now we have all the usernames we need and we can save them by redirecting STDOUT to a file with the following command.

```bash
cat accounts.csv | cut -d "," -f 4 | tail -n +2 | cut -d ":" -f 1 | cut -d "@" -f 2 > usernames.list
```

Vladimir pointed out a handy way to replace the `tail` command with a `grep`.
It's less cryptic and would go as follows.
It works because the header is the only line that does not contain an `@`.
The result is the same but the way we get there is slightly different.

```bash
cat accounts.csv | grep "@" | cut -d "," -f 4 | cut -d ":" -f 1 | cut -d "@" -f 2
```

### The $PASSWORD

To extract the password we need to combine two fields from the CSV file.
A *really* good command line program to achieve this is `awk`.
We haven't used it but [this](https://linuxhandbook.com/awk-command-tutorial/) is a good tutorial.
Don't forget the man pages!

```bash
cat accounts.csv | awk -F "," '{print $2 $3}' | tail -n +2
```

Sarah found an interesting feature of `cut` where you can show multiple fields at the same time.
The syntax is quite easy but it introduces a `,` we'll have to get rid of afterwards.
Note that `cut` always prints the fields in the order they appear in the file, so `-f 3,2` gives the same result as `-f 2,3`.
Combined with Vladimir's approach this gives a more comprehensible command.

```bash
cat accounts.csv | grep "@" | cut -d "," -f 3,2 | tr -d ","
```

If you feel like making the password more complex, you can try to add extra data into the `awk` command, or even append random numbers to the end.
How would you do this?

```bash
cat accounts.csv | awk -F "," '{print $2 "_helloworld_" $3}' | tail -n +2
```

We can now save these passwords to a file the same way we did before.
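
Spelled out, saving the passwords could look like the sketch below. To keep it self-contained it runs on a two-row stand-in CSV with invented people, and both filenames are placeholders: on the Raspberry Pi you would read from the real `accounts.csv` and write to `passwords.list`, which is the name the script further down expects.

```bash
CSV="sample.csv"             # stand-in; on the Pi this would be accounts.csv
OUT="sample_passwords.list"  # stand-in; on the Pi this would be passwords.list

# Two invented rows in the same layout as the real file (not the real data).
cat > "$CSV" << 'EOF'
EMAIL,LASTNAME,FIRSTNAME,MATRIX,GITEA,TEAM
jane@example.com,Doe,Jane,@jdoe:example.org,https://example.org/jdoe,red
bob@example.com,,Bob,@bob:example.org,https://example.org/bob,blue
EOF

# Keep only the data lines, take LASTNAME and FIRSTNAME,
# remove the comma between them and redirect the result to a file.
grep "@" "$CSV" | cut -d "," -f 2,3 | tr -d "," > "$OUT"

cat "$OUT"
```

With the stand-in data this prints `DoeJane` and `Bob`, the second one being short because that row has an empty LASTNAME, just like the last line of the real file.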
### The $GROUP

This is an *easy* one because it requires no real modification of the field.

```bash
cat accounts.csv | tail -n +2 | cut -d "," -f 6 > groups.list
```

## Using this information to create accounts

We now have commands that extract the information we need, plus three separate files that contain all the information as lists.
This is a good moment to introduce you to writing a very simple script.
I'll do it without a loop but you'll quickly understand it's a *lot* easier and more functional with a loop in there.
Remember that `$1` represents the first argument on the command line, so that when calling our script with `./script.sh 3` we'll get the username, password and group for the third user.
A combination of `head | tail` is a [classic](https://stackoverflow.com/questions/6022384/bash-tool-to-get-nth-line-from-a-file) way of selecting only one specific line from a file.
Last but not least, don't forget to add execution permissions to this script with `chmod`.

```bash
#!/bin/bash

# $1 is the line number of the user we want to look up
head -n "$1" usernames.list | tail -n 1
head -n "$1" passwords.list | tail -n 1
head -n "$1" groups.list | tail -n 1
```

Calling our script with `./script.sh 1` outputs the necessary information, which we can *copy/paste* to complete the following command.
We'll be prompted to paste in the proper information.

```bash
sudo adduser $USERNAME
```

Needless to say this is a labour-intensive operation that we can automate by adding some extra commands to our script.

### Putting it together as a script

Brace yourselves a bit but I promise it's worth it!
The only thing we have not seen is how to save the output of a command into a variable.
This can be done with the `$(...)` syntax.
I know it looks a bit cryptic but an example says more than words.

```bash
NOW=$(date)
echo $NOW
```

With this in mind, the following code should make sense.
We're doing the exact same thing but saving the output of each command into a variable.
On the last line we *use* the variables to create a message we display on our STDOUT.
```bash
#!/bin/bash

# Store the line requested via $1 from each list in its own variable
USERNAME=$(head -n "$1" usernames.list | tail -n 1)
PASSWORD=$(head -n "$1" passwords.list | tail -n 1)
GROUP=$(head -n "$1" groups.list | tail -n 1)

echo "user: $USERNAME password: $PASSWORD group: $GROUP"
```

This just outputs all the information on one line, but why not *use* this information to actually create the accounts?
As a counterpart to the `adduser` program you're used to, there is `useradd`, which is better suited for scripting purposes.
By default `useradd` is very *barebones* and does not create a home directory for the user, but a quick look at the `man useradd` pages tells us we can use the `-m` flag to do so.
This tells us the command `useradd $USERNAME -m` will create a user for us with their own home directory.

A [google search](https://linux.die.net/man/8/chpasswd) pointed me to `chpasswd` to set passwords from within a script.
The syntax used to set the password, which I found on [stackexchange](https://unix.stackexchange.com/questions/197448/change-password-programmatically), will be `echo "$USERNAME:$PASSWORD" | chpasswd`.
This gives us the following script.

```bash
#!/bin/bash

USERNAME=$(head -n "$1" usernames.list | tail -n 1)
PASSWORD=$(head -n "$1" passwords.list | tail -n 1)
GROUP=$(head -n "$1" groups.list | tail -n 1)

echo "adding user: $USERNAME"
useradd "$USERNAME" -m
echo "setting password: $PASSWORD for $USERNAME"
echo "$USERNAME:$PASSWORD" | chpasswd
```

You probably noticed I did not add the users to their *red/blue* groups.
We can do that on the `useradd` line by using the `-G` flag, but it would fail if the group does not exist yet.
The `groupadd` command will add a group to the system, and if we add the `-f` flag it will do so without complaining if the group already exists.
This way we can just execute that line each time without worrying whether the group exists or not.
Nice!
```bash
#!/bin/bash

USERNAME=$(head -n "$1" usernames.list | tail -n 1)
PASSWORD=$(head -n "$1" passwords.list | tail -n 1)
GROUP=$(head -n "$1" groups.list | tail -n 1)

echo "making sure $GROUP exists..."
groupadd -f "$GROUP"
echo "adding user: $USERNAME"
useradd "$USERNAME" -m -G "$GROUP"
echo "setting password: $PASSWORD for $USERNAME"
echo "$USERNAME:$PASSWORD" | chpasswd
```

Those who switched to the newly created users to check whether they actually *work* probably noticed that the shell is a *very* basic one.
You can find out which shell these new accounts use by looking at the `/etc/passwd` file.
There are multiple ways to solve this problem, but a look at the `man useradd` pages tells us we can use the `-s` flag to set the shell we want for the user we're creating.
We probably want to use `/bin/bash` for this option!

```bash
#!/bin/bash

USERNAME=$(head -n "$1" usernames.list | tail -n 1)
PASSWORD=$(head -n "$1" passwords.list | tail -n 1)
GROUP=$(head -n "$1" groups.list | tail -n 1)

echo "making sure $GROUP exists..."
groupadd -f "$GROUP"
echo "adding user: $USERNAME"
useradd "$USERNAME" -m -G "$GROUP" -s "/bin/bash"
echo "setting password: $PASSWORD for $USERNAME"
echo "$USERNAME:$PASSWORD" | chpasswd
```

This is getting pretty close to perfect!
We can now run through all of the lines of the file, one by one, and automatically create the proper user, password and group combinations.
To know how many accounts we have to create we can use `wc -l accounts.csv` (minus one for the header line) and then just run the script, incrementing the number each time.

```bash
sudo ./script.sh 1
sudo ./script.sh 2
sudo ./script.sh 3
echo "etc..."
```

### Taking it further as an extra challenge

If we want to automate the entire thing we'll need a loop to *loop through* every line of the `accounts.csv` file.
Bash loops are for a future class but I'll leave you with this quick example for those who feel like messing around.
Remember the *oneliners* we constructed at the beginning to extract the relevant information from the line!
Don't worry if this looks too complicated at the moment, we'll do this exercise again when we're looking into [bash scripting](https://ryanstutorials.net/bash-scripting-tutorial/).

```bash
#!/bin/bash

# Read the whole file given as first argument into a variable.
# (Named ALL_LINES because bash reserves LINES for the terminal height.)
ALL_LINES=$(cat "$1")
# The unquoted variable is split on whitespace, which works here
# because none of the lines in our CSV file contain spaces.
for LINE in $ALL_LINES; do
    echo "$LINE"
done
```

## Setting up the fileserver

The concept of a fileserver is about as old as the world wide web, email, newsgroups etc.
Its main purpose in life is to offer a place to store and retrieve large files.
The Debian repositories you download all your packages from were, for many years, reachable over the FTP protocol as well.
There are multiple implementations available, some plain, some secure, some less secure.
We'll have a look at the two most common solutions to host and transfer files.
To use the FTP protocol as a client we can install any FTP client we can [find](https://www.slant.co/topics/12056/~ftp-clients-for-linux).
A good candidate is `filezilla`, which you can find in the Debian repositories, or online if you want to use it on Windows.

### FTP

A quick [google](https://likegeeks.com/ftp-server-linux/) tells us `vsftpd` is a popular *File Transfer Protocol* server.
As expected you can find it in the main Debian repositories.

```bash
sudo apt search vsftpd
```

You know how to install it by now!
To know *where* to configure the server we can look at the `man vsftpd` pages.
Scroll all the way to the bottom to see which files it uses to configure itself.
The configuration file itself is very *verbose* and should explain itself.
Before changing the configuration do have a quick read of *Taking changes to configuration files into account*.

### SFTP

Even without installing any additional server, we can offer our users a way to download, and upload, files to our server.
When you install `openssh-server`, and pay attention to the installation dialog, you'll see it will also install an additional package called `openssh-sftp-server`.
This in itself is enough to have an SFTP server up and running.
Done!
Try it out on one of your virtual machines or the Raspberry Pi.

It does raise the question of security.
When installing the `vsftpd` package your users can *only* upload and download files, but when you install an ssh server they can also get shell access!
This might not be the desired behaviour for all your users.
Luckily we can change the system, or server, configuration to modify it to our requirements.
I can think of two ways to limit the shell access for specific users.

The first one should ring a bell.
As we define our default shell, found in the `/etc/passwd` file, to be `/bin/bash`, we can also change this for specific users to `/usr/sbin/nologin`.
This will block those users from getting an actual shell when sshing into the machine, but SFTP should still work, at least when the server handles it with the built-in `internal-sftp` we'll meet below.
There are multiple downsides to this solution, some of which we'll investigate further later on, but the main one I see from a beginner's point of view is that it requires a customisation for each account we want to block.

As all users we want to block are part of a group, either *red* or *blue*, there should be a way to restrict access on a group level, no?
This will be done by modifying the configuration of the ssh server.
How will we find *which* configuration files we need to change and *where* can we find those files?
`man sshd` should do the trick.
Navigate to the end of the manual and you'll see a huge list of files the server uses to configure itself.
Why `sshd` and not `ssh` you may ask?
Well, `ssh` is the **client** you use to connect to the `sshd` **server** and they are two different *programs*.
Remember that `ssh` was installed on your Linux machine even before installing `openssh-server`!

It's the last lines of the configuration file, `/etc/ssh/sshd_config`, that point out some interesting features.
```bash
# override default of no subsystems
Subsystem sftp /usr/lib/openssh/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
#	X11Forwarding no
#	AllowTcpForwarding no
#	PermitTTY no
#	ForceCommand cvs server
```

Remember that the lines starting with `#` are comments and are not taken into account when the server configures itself.
These settings point out that we can add different rules for different users, and probably groups as well.
So, if you trust all people in the *blue* group, you can add a rule to match all users in the *red* group and restrict them to only sftp login and no shell access.
With this in mind, what would adding the following to the configuration file do?

```bash
Match Group blue
	ForceCommand internal-sftp
```

But the users can still walk around the entire file system, which is not always a good idea.
Using the keywords **sftp**, **chroot**, **group** in [google](https://googlethatforyou.com?q=sftp%20chroot%20group) we find a link to the [arch](https://wiki.archlinux.org/index.php/SFTP_chroot) wiki, which is *always* a solid place to inform ourselves.
You can have a go at this as an additional exercise!

### Taking changes to configuration files into account

When you make these changes to the configuration files you probably noticed they are not applied immediately.
So when are they taken into account then?
Most servers we'll install onto our various machines run as [daemons](https://en.wikipedia.org/wiki/Daemon_(computing)) in the background.
They listen for an incoming client connection on a port, do whatever the client requests of them and go back to sleep when there are no tasks for them to perform.
They might have internal tasks scheduled but we'll set this aside for the moment.
As with most things Linux, configuration is done in a text file and *most* daemons will read their corresponding configuration files upon startup.
Think of the `.bashrc` file, which is read every time a new shell is started.
Once they know how to behave, from reading their configuration file, they will act accordingly.
This raises the question, how do you *restart* a running server?

The system responsible for launching most daemons on our Debian installations, and also on our Raspberry Pis, is called **systemd**.
The three most used commands you need to know are as follows: how to *start* a service, how to *stop* one and how to *check* if one is running properly.
For system daemons, such as `sshd` or most webservers, you need administrator powers to interact with them.

```bash
sudo systemctl start sshd.service
sudo systemctl stop sshd.service
sudo systemctl status sshd.service
```

These three commands will go a long way but there are other handy ones you can try out.
I advise you to use *tab-completion* as much as possible as it will help you to construct the proper commands, especially at the beginning of your Linux journey because it takes some time to remember specific service names.
Have a look at all the base commands of `systemctl` and you'll notice quite a few interesting things.
For example, it can be used to reboot or shut down your computer as well!

For those who want to dig deeper into systemd itself I advise you to have a browse in its configuration directory.
You'll find it at `/etc/systemd`, and the manpages `man systemd` are also very helpful, but a bit verbose.
We will have a deeper look at the internals of systemd later throughout the course.
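
The three commands above leave the earlier question of *restarting* a running server open. A minimal sketch, assuming the `restart` and `is-active` subcommands of `systemctl` and a made-up helper name `restart_service`, could look like this.

```bash
# Hypothetical helper: restart the service named in $1 so it rereads its
# configuration file, then report whether it came back up.
restart_service() {
    sudo systemctl restart "$1"
    if systemctl is-active --quiet "$1"; then
        echo "$1 is running"
    else
        echo "$1 is not running, have a look at: sudo systemctl status $1"
    fi
}
```

After editing the ssh server configuration you would call it as `restart_service sshd.service`. Restarting is simply a *stop* followed by a *start*, which is why the daemon reads its configuration file again.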