Installing Whisper AI on an entry-level GPU

As you can see from my blog, I have a bunch of high-end GPUs for my AI work. The GPU in my daily driver PC, on the other hand, is a complete joke (Nvidia GTX 1650) with 4GB of VRAM… Not exactly a GPU you would use for anything remotely demanding

But running Whisper on my local machine is very convenient: the audio files are already there, and there is no need to log in to any remote machines and the like. So I will be installing a small version of Whisper here, and let us see how this ancient GPU does

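1- I already have Python 3.12.7 installed; if you don’t, run “sudo apt install python3 python3-pip” (on Debian and Ubuntu the venv module may also need the python3-venv package). Then create a virtual environment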

python3 -m venv whisper-env

And activate it

source whisper-env/bin/activate

Now, before you proceed, if you want your “HuggingFace” directory (where the models actually live) on a different drive or something, you should start by adding the following line to ~/.bashrc or whatever your system uses. Also remember to either run (source ~/.bashrc) or close and reopen your terminal for the change to take effect

export HF_HOME=/mnt/bigdrive/huggingface
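If you would rather not edit the file by hand, the step above can be scripted. This is only a sketch: the function name add_hf_home is made up for illustration, and /mnt/bigdrive/huggingface is the example path from above. It appends the line only if it is not already there, so running it twice does no harm.

```shell
# Append the HF_HOME export to a shell startup file, but only once.
# $1 = directory for the Hugging Face cache, $2 = startup file (default ~/.bashrc)
add_hf_home() {
    hf_dir=$1
    rc_file=${2:-$HOME/.bashrc}
    line="export HF_HOME=$hf_dir"
    # -q quiet, -x match the whole line, -F treat the text as a fixed string
    grep -qxF -- "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Usage:
#   add_hf_home /mnt/bigdrive/huggingface
#   source ~/.bashrc
```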

Now, let us go ahead and install faster-whisper

pip install faster-whisper

Also, make sure PyTorch with GPU support is available:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Test GPU availability:

python3 -c "import torch; print(torch.cuda.is_available())"

Now, I thought tiny would be the right size for my GPU, but it turned out “base” works just fine!

faster-whisper sample.wav --model-size base --compute-type float16
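faster-whisper is first and foremost a Python library, so if your install does not give you a faster-whisper command, the same run can be done through its Python API. A minimal sketch, assuming the virtual environment from above is active and sample.wav is in the current directory; the shell function name transcribe_base is made up for illustration.

```shell
# Transcribe one audio file with the "base" model on the GPU in float16.
# $1 = path to the audio file
transcribe_base() {
    python3 - "$1" <<'PY'
import sys
from faster_whisper import WhisperModel

# Same settings as the command above: base model, GPU, half precision
model = WhisperModel("base", device="cuda", compute_type="float16")
segments, info = model.transcribe(sys.argv[1])
print("Detected language:", info.language)
for seg in segments:
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
PY
}

# Usage:
#   transcribe_base sample.wav
```

If 4GB turns out to be too tight for a bigger model, compute_type="int8_float16" is the usual thing to try next.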

How do we know if we are hitting the GPU limits ?

watch -n 1 nvidia-smi
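watch shows the whole nvidia-smi table. If all you care about is how close you are to the 4GB limit, nvidia-smi can print just the memory numbers, and a small helper can turn them into a percentage. A sketch, where vram_percent is a made-up name:

```shell
# Read "used, total" lines (in MiB, without units) and print the percentage used.
vram_percent() {
    while IFS=', ' read -r used total; do
        echo "$(( used * 100 / total ))%"
    done
}

# Usage (needs the NVIDIA driver; prints one line per GPU):
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | vram_percent
```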

12TB disk does not show up

I have been using an Intel Atom system (an Intel “D525MW” board) as a network attached storage system for some time now. I have an extra SATA PCIe card (Silicon Image, Inc. SiI 3132) so that I can connect 4 disks. When the 12TB Western Digital disk (HGST HUH721212AL) is connected to the external SATA card, it does not show up, meaning an “fdisk -l” does not list it!

So the next thing to do is swap the SATA connection with a different disk connected to the motherboard, and suddenly it works. Amazing, but I need to know where the problem comes from

The first theory is that disks that are SFF-8447 compliant (rather than the old IDEMA standard) are not supported by this controller !
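Before settling on a theory, it helps to see what the kernel says about the port while the disk is plugged into the card. A sketch, assuming the card is handled by the sata_sil24 kernel driver (the one that normally drives the SiI 3132); ata_lines is a made-up helper name.

```shell
# Keep only kernel log lines about SATA ports (ata1:, ata3.00:, ...)
# or about the sata_sil24 driver.
ata_lines() {
    grep -iE 'ata[0-9]+(\.[0-9]+)?:|sata_sil24'
}

# Usage (as root):
#   dmesg | ata_lines
#   lsblk -o NAME,SIZE,MODEL
```

A link that never comes up on the card's port, or a disk that appears with the wrong size, would each point in a different direction.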

DVD VOB files to MP4 with FFMPEG

Let us start by analyzing our input files and what we expect to find

  • A video stream, the obvious
  • Audio streams for different languages
  • Audio streams that should play together
  • dvd_subtitle
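To see which of these streams a given file actually has, ffprobe (it ships with FFmpeg) can print one line per stream. A sketch, with list_streams and subtitle_indexes as made-up helper names:

```shell
# One CSV line per stream; the stream index is always the first field.
list_streams() {
    ffprobe -v error -show_entries stream=index,codec_name,codec_type -of csv=p=0 "$1"
}

# Print the index of every dvd_subtitle stream in the file.
subtitle_indexes() {
    list_streams "$1" | grep 'dvd_subtitle' | cut -d, -f1
}

# Usage:
#   list_streams input.VOB
#   subtitle_indexes input.VOB
```

If a subtitle stream you expect is missing from the list, raising -probesize and -analyzeduration sometimes helps, since subtitles can start late in a VOB.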

dvd_subtitle

Let us start with the subtitles. The encoder/decoder for a dvd_subtitle stream is dvdsub (the similarly named dvbsub is for DVB broadcast subtitles, a different format).

According to the documentation, this codec decodes the bitmap subtitles used in DVDs; the same subtitles can also be found in VobSub file pairs and in some Matroska files.

Matroska uses VobSub. VobSub is basically the same as the stream on a DVD, but stored in a file with the extension “*.sub” (paired with an “*.idx” index file).

So, I have a “dvd_subtitle” stream at 0:3; to extract it, I will probably use a command such as

Extracting the subtitle files

ffmpeg -i input.VOB -c copy -map 0:3 subtitles.sub
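If FFmpeg refuses to write the raw stream into a .sub file, one workaround is to copy it into a Matroska container instead, which can hold DVD subtitles as they are. This is a sketch rather than a tested recipe; copy_subtitle_stream is a made-up name and 0:3 is the stream from my file.

```shell
# Copy one subtitle stream, untouched, into a Matroska file.
# $1 = input file, $2 = stream specifier such as 0:3, $3 = output file
copy_subtitle_stream() {
    ffmpeg -i "$1" -map "$2" -c:s copy "$3"
}

# Usage:
#   copy_subtitle_stream input.VOB 0:3 subtitles.mkv
```

From there, mkvextract (part of MKVToolNix) can write the track out as a VobSub idx/sub pair.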

Now, at this stage, if you want to convert them to SRT (or any other text-based subtitle format), you will need a tool that does OCR, because DVD subtitles are bitmaps rather than text. One such tool is VobSub2SRT

Another method after extracting is to use an online service such as this one (https://subtitletools.com/convert-to-srt-online) to turn the subtitles into SRT

Find in files and replace in files in Linux

If you are looking for something similar to Notepad++’s ability to replace a string in all the files in a certain folder, look no further; in Linux this is a simple command

Assuming you are on the command line and the directory containing the files is the active one (cd)

The following line will replace foo with bar in all files in the current directory, but not in subdirectories

sed -i -- 's/foo/bar/g' *

If you want it to go recursively into subdirectories, you can combine the above with the find command

find . -type f -exec sed -i 's/foo/bar/g' {} +

If you want sed to back up the files before it does the replace, use the following command; you can replace the .bak with anything you like

sed -i.bak -- 's/foo/bar/g' *
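One thing to keep in mind is that sed -i rewrites every file it is given, even the ones that do not contain foo, which changes their modification times. A sketch that first asks grep which files actually contain the string and only hands those to sed (GNU grep and xargs assumed; replace_foo_with_bar is a made-up name):

```shell
# Replace foo with bar only in files that actually contain foo,
# recursing into subdirectories of the directory given as $1 (default: .).
replace_foo_with_bar() {
    dir=${1:-.}
    # -r recurse, -l print matching file names only, -Z end each name with NUL
    # xargs -0 reads NUL-separated names, -r skips sed if there are none
    grep -rlZ -- 'foo' "$dir" | xargs -0 -r sed -i 's/foo/bar/g'
}

# Usage:
#   grep -rl -- 'foo' .      # preview which files would change
#   replace_foo_with_bar .
```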

SD cards trim command

Many modern SD cards support the trim command in Linux; the problem is that not many SD card readers (mostly USB) do

So running fstrim on a mounted card will usually return an error such as

fstrim -v /hds/usb
fstrim: /hds/usb: the discard operation is not supported

The command “lsblk --discard /dev/sdf” should tell you whether your SD card/adapter combination supports trim right now (zeros in the DISC-GRAN and DISC-MAX columns mean it does not), so the failure of the fstrim command above can be predicted in advance with this command
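The same check can be done from a script, so that fstrim is only attempted when it has a chance of working. A sketch, with supports_discard as a made-up helper name and /dev/sdf and /hds/usb taken from the example above:

```shell
# Succeeds if the block device in $1 advertises discard (TRIM) support.
# DISC-MAX is the largest discard request the kernel will send; 0 means none.
supports_discard() {
    max=$(lsblk --bytes --nodeps --noheadings --output DISC-MAX "$1" | tr -d ' ')
    [ -n "$max" ] && [ "$max" != "0" ]
}

# Usage:
#   supports_discard /dev/sdf && fstrim -v /hds/usb
```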

Now, assuming your SD card is not mounted and you need to format it, will formatting restore the speed? Unfortunately not. To restore the speed, you might want to run the blkdiscard command before formatting your SD card. Mind you, this command will delete all your data!

blkdiscard -f /dev/sdf

but even that might not work

blkdiscard: Operation forced, data will be lost!
blkdiscard: /dev/sdf: BLKDISCARD ioctl failed: Operation not supported

Adding an internal network to KVM

A private network connects select virtual machines to other virtual machines on the same host, and to the host itself. I usually use it to share Samba shares between all virtual machines without giving those virtual machines access to the internet.

To do this, you will need to add a bridge to the host computer without an actual network interface attached to it. You can also add DHCP if you don’t care to hard-code the IP addresses. The virtual machine can then use this interface to talk to other virtual machines or the host itself. A virtual machine can have both this network interface and another that does have access to the internet, if you so choose
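If your KVM host is managed through libvirt, one way to get such a bridge is to let libvirt create it from a network definition that has no forward element, which is what makes the network isolated. This is a sketch under that assumption, not necessarily the method the full post uses; the network name internal0, the bridge name virbr10 and the 192.168.150.0/24 range are all made up for illustration.

```shell
# Write a libvirt definition for an isolated network to the file given as $1.
# There is no <forward> element, so guests reach each other and the host only.
write_isolated_net_xml() {
    cat > "$1" <<'EOF'
<network>
  <name>internal0</name>
  <bridge name='virbr10' stp='on' delay='0'/>
  <ip address='192.168.150.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.150.100' end='192.168.150.200'/>
    </dhcp>
  </ip>
</network>
EOF
}

# Usage (as root):
#   write_isolated_net_xml /tmp/internal0.xml
#   virsh net-define /tmp/internal0.xml
#   virsh net-start internal0
#   virsh net-autostart internal0
```

Leave out the dhcp block if you prefer to hard-code the addresses inside the virtual machines.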


Free SSL certificates with Let’s Encrypt, step by step

Let’s Encrypt is a Certificate Authority (CA) run by the Internet Security Research Group (ISRG), and is sponsored by some of the biggest names in the web industry

You are probably here to create a certificate, not to get a history lesson! So let me cut to the chase; for those who want to know more, there is always Wikipedia (Let’s Encrypt on Wikipedia)

Let’s Encrypt provides certificates for domain names, including wildcard certificates (which I will get to further down). What we are going through here is the manual process, which serves to give you a taste of how things work. In practice, you are encouraged to use one of the automated methods for multiple reasons; one compelling reason is that Let’s Encrypt issues certificates valid for three months only! You don’t want to have to tend to your certificate every three months, do you?

To simplify things, I will create a step by step video to demonstrate the creation process and post it here, but for now I will simply take you through the steps. In this tutorial, all you need is SSH access to any server, including one you have at home, or even a virtual machine running Linux inside your Windows computer; anything goes. Once you have a certificate, you can move it to your production server. This allows me to keep this as general as possible, and it is done using the --manual option. So without further ado, let me get to it

1- Log in to a Linux server and install certbot, the tool that allows you to get certificates from Let’s Encrypt. On the official website they promote the use of snap; here, I will skip snap and use Debian’s repository, which is simpler and avoids getting into snap

apt install certbot

Now that you have certbot, let us create a certificate for the domain example.com (replace it with your own)

certbot certonly --manual --preferred-challenges http

The --preferred-challenges option lets you specify which challenge (http or dns) you would like to perform. The manual plugin is basically the same as the webroot plugin but not automated, which makes it a hassle to keep up to date, as certificates issued this way need to be renewed manually every 3 months (you can take extra steps to automate this, which I will describe in another post to keep things tidy)

Now, as soon as you enter the above, you will enter an interactive dialogue with the following steps

Note: If you want to create a wildcard certificate for your domain name, Let’s Encrypt allows the use of the * wildcard, but only with the DNS challenge, so the command must reflect that (--preferred-challenges dns). When asked for a domain, simply enter *.example.com (or pass -d ‘*.example.com’ on the command line), and it should work normally
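Put together, the wildcard variant could look like the sketch below. The function names are made up for illustration; the quotes around the wildcard matter, because otherwise the shell may try to expand the * itself. The second helper uses dig to confirm that the TXT record certbot asks for is visible before you press Enter.

```shell
# Request a wildcard certificate plus the bare domain using the DNS challenge.
# $1 = domain, for example example.com
request_wildcard() {
    certbot certonly --manual --preferred-challenges dns -d "*.$1" -d "$1"
}

# Print the TXT record(s) currently published for the challenge.
show_challenge_txt() {
    dig +short TXT "_acme-challenge.$1"
}

# Usage:
#   request_wildcard example.com
#   show_challenge_txt example.com    # run in a second terminal
```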

As soon as you are in, you will be asked

1- An email for notifications
2- Do you agree to the terms of service ?
3- Would you like to subscribe to the newsletter ?
4- enter your domain names (you should enter both example.com and www.example.com separated by either a comma or a space)
5- certbot then prints the first challenge

Create a file containing just this data:

Pg1xJ.........-88

And make it available on your web server at this URL:

http://example.com/.well-known/acme-challenge/Pg1...........xuu_0

6- Now you need to create the 2 challenge files, one for example.com and the other for www.example.com

Create a file containing just this data:

Ud4m81x..............zupbWEz-88

And make it available on your web server at this URL:

http://www.example.com/.well-known/acme-challenge/Ud4........550

(This must be set up in addition to the previous challenges; do not remove,
replace, or undo the previous challenge tasks yet.)
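Creating those files by hand is easy to get wrong, so here is a small sketch that does it. The function name write_challenge and the web root /var/www/html are assumptions; TOKEN and DATA stand for the values certbot printed for you.

```shell
# Write one ACME HTTP challenge file below the web root.
# $1 = web root, $2 = file name (last part of the URL), $3 = file content
write_challenge() {
    dir="$1/.well-known/acme-challenge"
    mkdir -p "$dir"
    # printf without a trailing newline, so the file contains just the data
    printf '%s' "$3" > "$dir/$2"
}

# Usage:
#   write_challenge /var/www/html TOKEN DATA
#   curl http://example.com/.well-known/acme-challenge/TOKEN
```

Check each URL with curl or a browser before letting certbot continue.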

--------------------------


IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/example.com/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/example.com/privkey.pem
   Your certificate will expire on 2023-03-11. To obtain a new or
   tweaked version of this certificate in the future, simply run
   certbot again. To non-interactively renew *all* of your
   certificates, run "certbot renew"
 - If you like Certbot, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le

At this stage, there are things you should remain aware of

1- DO NOT RENAME OR MOVE THE CERTIFICATES, they need to be in place for renewal if you decide to not automate and check on your certificates every 3 months.

2- Copy (Don’t move) them to the ssl directory, and add them to your config files, the only files you will need to include in your nginx or apache2 config are as follows

For apache 2, you need to use the following 2 lines, modify the path to the files to wherever you have placed them

      SSLCertificateFile /etc/apache2/ssl/example.com/fullchain.pem
      SSLCertificateKeyFile /etc/apache2/ssl/example.com/privkey.pem

And for nginx

        ssl_certificate /etc/nginx/ssl/example.com/fullchain.pem;
        ssl_certificate_key /etc/nginx/ssl/example.com/privkey.pem;
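A note on the copying step: the files under /etc/letsencrypt/live are symbolic links into /etc/letsencrypt/archive, so you want to copy what they point at, not the links themselves. A sketch, with copy_cert as a made-up name and the nginx path from above as the destination:

```shell
# Copy the certificate and key out of certbot's live directory.
# $1 = live directory, $2 = destination directory
copy_cert() {
    mkdir -p "$2"
    # -L follows symlinks, so the copies are regular files, not links
    cp -L "$1/fullchain.pem" "$1/privkey.pem" "$2/"
    # The private key should be readable by its owner only
    chmod 600 "$2/privkey.pem"
}

# Usage (as root):
#   copy_cert /etc/letsencrypt/live/example.com /etc/nginx/ssl/example.com
```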

So, restart apache or nginx, and you should be able to see the certificate in action. This is the simplest way to use Let’s Encrypt

Now, after 3 months, the simplest way to renew the certificate is to issue the command

certbot certonly --force-renewal -d example.com -d www.example.com
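Since nothing reminds you when the three months are up, a quick expiry check is handy. A sketch using openssl, where cert_needs_renewal is a made-up name and 30 days is an arbitrary default; note that it also reports "needs renewal" if the file cannot be read at all.

```shell
# Succeeds if the certificate in $1 expires within $2 days (default 30).
cert_needs_renewal() {
    cert=$1
    days=${2:-30}
    # -checkend N exits 0 if the certificate is still valid N seconds from now
    ! openssl x509 -checkend $(( days * 86400 )) -noout -in "$cert" >/dev/null 2>&1
}

# Usage:
#   openssl x509 -enddate -noout -in /etc/letsencrypt/live/example.com/fullchain.pem
#   cert_needs_renewal /etc/letsencrypt/live/example.com/fullchain.pem && echo "time to renew"
```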

Mounting unclean NTFS windows drive in Linux

Whenever I get the following message

mount /dev/sdd1 /hds/sgt2tb
The disk contains an unclean file system (0, 0).
Metadata kept in Windows cache, refused to mount.
Falling back to read-only mount because the NTFS partition is in an
unsafe state. Please resume and shutdown Windows fully (no hibernation
or fast restarting.)
Could not mount read-write, trying read-only

The command

ntfsfix /dev/sdd1

resolves the issue, and produces the following message

Mounting volume... The disk contains an unclean file system (0, 0).
Metadata kept in Windows cache, refused to mount.
FAILED
Attempting to correct errors...
Processing $MFT and $MFTMirr...
Reading $MFT... OK
Reading $MFTMirr... OK
Comparing $MFTMirr to $MFT... OK
Processing of $MFT and $MFTMirr completed successfully.
Setting required flags on partition... OK
Going to empty the journal ($LogFile)... OK
Checking the alternate boot sector... OK
NTFS volume version is 3.1.
NTFS partition /dev/sdd1 was processed successfully

The same mount command you see here will now work flawlessly

mount /dev/sdd1 /hds/sgt2tb

I am still unsure which of the steps mentioned above is responsible, as this oftentimes pops up on drives that were never system drives, so there is no hibernation file problem
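Because mount falls back to read-only instead of simply failing, a script may not notice the problem from the exit status alone. A sketch that checks how the drive actually ended up mounted, with mounted_read_only as a made-up helper name and the device and mount point from above:

```shell
# Succeeds if the file system mounted at $1 is mounted read-only.
mounted_read_only() {
    # OPTIONS is a comma-separated list; look for an option that is exactly "ro"
    findmnt -n -o OPTIONS "$1" | tr ',' '\n' | grep -qx 'ro'
}

# Usage (as root):
#   if mounted_read_only /hds/sgt2tb; then
#       umount /hds/sgt2tb
#       ntfsfix /dev/sdd1
#       mount /dev/sdd1 /hds/sgt2tb
#   fi
```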