<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Ceph on blog.iankulin.com</title><link>https://blog.iankulin.com/tags/ceph/</link><description>Recent content in Ceph on blog.iankulin.com</description><generator>Hugo</generator><language>en-AU</language><lastBuildDate>Sun, 23 Jul 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.iankulin.com/tags/ceph/index.xml" rel="self" type="application/rss+xml"/><item><title>Proxmox 8.0 Install</title><link>https://blog.iankulin.com/proxmox-8-0-install/</link><pubDate>Sun, 23 Jul 2023 00:00:00 +0000</pubDate><guid>https://blog.iankulin.com/proxmox-8-0-install/</guid><description>&lt;p&gt;I&amp;rsquo;m normally an x.1 release type of sysadmin, but the increasing temptation of installing Proxmox 8.0 while I&amp;rsquo;ve got some time off, and the fact that I&amp;rsquo;ve got a cluster so I can just move the VM&amp;rsquo;s around, all add up to thinking I&amp;rsquo;ll do that today.&lt;/p&gt;
&lt;img src="https://blog.iankulin.com/images/cluster-2.png" width="328" alt=""&gt;
&lt;p&gt;Here&amp;rsquo;s how my system works. It consists of three HP-800 mini G2&amp;rsquo;s. &lt;code&gt;pve-prod1&lt;/code&gt; is a bit fancier - i7 6700T and 32GB, the other two are i5 6500T and 16GB. The production VM&amp;rsquo;s use the local SSD but backups go to the NAS. All the machines are currently running Proxmox 7.4. They are not clustered in the proper sense - I don&amp;rsquo;t need high availability, and I don&amp;rsquo;t want to run them all the time. &lt;code&gt;pve-prod1&lt;/code&gt; runs 24/7 and I just power up &lt;code&gt;pve-dev1&lt;/code&gt; when I&amp;rsquo;m working on something.&lt;/p&gt;
&lt;p&gt;The intention is that although I&amp;rsquo;m not on high availability, I can quickly come back from a machine failure by powering &lt;code&gt;pve-prod2&lt;/code&gt; up and restoring from the latest VM backup from the NAS. &lt;code&gt;pve-prod1&lt;/code&gt; does not have a full load yet (I&amp;rsquo;m slowly cancelling cloud services and moving them in-house) but once it does, I&amp;rsquo;d have the capacity to fully replace it by sharing any guests between &lt;code&gt;pve-prod2&lt;/code&gt; and &lt;code&gt;pve-dev1&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id="migration-plan"&gt;Migration plan&lt;/h3&gt;
&lt;img src="https://blog.iankulin.com/images/migration-1.png" width="273" alt=""&gt;
&lt;p&gt;Currently &lt;code&gt;pve-prod1&lt;/code&gt; is only running two guests: Jellyfin and a docker host with a collection of smallish services. The plan is to move those to &lt;code&gt;pve-prod2&lt;/code&gt;, check everything is working, then install the new Proxmox 8 onto &lt;code&gt;pve-prod1&lt;/code&gt;. Apart from giving me the opportunity to do that, it&amp;rsquo;s a good test of the plan for recovering from a &lt;code&gt;pve-prod1&lt;/code&gt; failure. I&amp;rsquo;ll live off it for a few days to ensure that it&amp;rsquo;s a viable process.&lt;/p&gt;
&lt;p&gt;A small hitch with this is that the RAM in &lt;code&gt;pve-prod1&lt;/code&gt; cost me $100, and I didn&amp;rsquo;t want to not use it, so I created the jellyfin VM with 16GB RAM. It&amp;rsquo;s a simple matter to stop it, give it less, and restart it - except it seems to be using it all.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-7.31.59-am.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;You can see from this, I tried shutting it down and restarting - thinking that the memory use might climb up slowly as the app was used, but it just went straight back to 15GB. In a way, I approve of a VM using the memory I&amp;rsquo;ve given it - presumably it is caching or something. Jellyfin should certainly be able to run on a machine with much less memory, so I suppose I&amp;rsquo;ll stop it, back it up, and try it in a smaller VM.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-7.42.58-am.jpg" alt=""&gt;&lt;/p&gt;
&lt;p&gt;Yep, that works fine. And I can&amp;rsquo;t notice any difference in the app performance. So I stopped it, backed it up, and restored onto prod2. And immediately bumped into a couple of problems when I tried to start it.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-8.52.34-am.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;There were two hardware incompatibilities - the first was that on prod1 I had passed through the GPU from the host (in an unsuccessful attempt to use Quick Sync hardware transcoding for video). I don&amp;rsquo;t need that, so it gets deleted from the &amp;lsquo;hardware&amp;rsquo; for the VM.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-8.47.00-am.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;And the second was that I still had the Debian 11 ISO mounted in the &amp;lsquo;cd-rom&amp;rsquo;. Lol - the Debian installer specifically tells you to remove this before it reboots. That can be removed exactly as I had done for the GPU pass through, and the VM boots fine, and the app tests out ok.&lt;/p&gt;
&lt;p&gt;The first time I ever did this - move a guest VM from one lot of hardware to another, then boot it up and all my apps are working perfectly on their old IP addresses - I was amazed and danced around in excitement. I didn&amp;rsquo;t dance today, but it is so cool.&lt;/p&gt;
&lt;p&gt;Interestingly, it&amp;rsquo;s decided to use much less RAM now. I caused that increase at the end of the graph by rescanning the media library, then browsing through all the titles so the cover images would have to be loaded - so perhaps it&amp;rsquo;s the web server caching them all. It&amp;rsquo;s hard to know for sure without some objective measurements, but I suspect the app was crisper and more responsive than before. In any case, it certainly wasn&amp;rsquo;t any worse.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-9.02.56-am.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;Moving the docker host over was straightforward and only took five minutes of downtime as it&amp;rsquo;s a smaller image. I guess a lot of that time is just my 1Gb network limitation or the spinning disk transfer speed from the NAS - the docker host was 4GB and Jellyfin 14GB.&lt;/p&gt;
&lt;h3 id="nuke-and-pave"&gt;Nuke and pave&lt;/h3&gt;
&lt;p&gt;I try and keep my hosts very clean, so wiping them and starting over is no biggie, but since this node has been up I have installed a cron job for &lt;a href="https://blog.iankulin.com/linux-shell-script-for-temperature-logging/"&gt;temperature logging&lt;/a&gt;. I&amp;rsquo;ve documented that in a blog post so I&amp;rsquo;ll be able to recreate it, but this sort of thing is the reason I&amp;rsquo;m interested in &lt;a href="https://blog.iankulin.com/getting-started-with-ansible/"&gt;Ansible&lt;/a&gt;. Another project while I&amp;rsquo;ve got some time will be to recreate that on the new machine with Ansible so it&amp;rsquo;s trivial to restore in future. I pulled the temperature log file down though - because who doesn&amp;rsquo;t like eighty thousand data points?&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/temp1.jpg" alt=""&gt;&lt;/p&gt;
&lt;p&gt;There is a &lt;a href="https://pve.proxmox.com/wiki/Upgrade_from_7_to_8"&gt;published process to upgrade Proxmox&lt;/a&gt; from 7.x to 8, so I briefly considered it, but fresh installs are generally less likely to lead to drama, especially this early in the major release cycle. Plus, I keep my installs clean to allow it - this is a freedom allowed by my sysadmin discipline along with the investment in redundant hardware so there&amp;rsquo;s zero time pressure while I&amp;rsquo;m doing it.&lt;/p&gt;
&lt;h3 id="run-book-for-new-proxmox-install"&gt;Run Book for New Proxmox Install&lt;/h3&gt;
&lt;p&gt;My install process for Proxmox goes something like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Flash the ISO onto a USB drive with &lt;a href="https://etcher.balena.io/"&gt;Balena Etcher&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Plug in the USB drive, my bluetooth keyboard/mouse USB, and the screen - I&amp;rsquo;ve got a special long HDMI cord that reaches from my desk to the servers&lt;/li&gt;
&lt;li&gt;Boot up, mashing the boot menu key (F9 on my G2&amp;rsquo;s)&lt;/li&gt;
&lt;li&gt;Follow my nose through the prompts - since this is an existing server, the DHCP serves up the correct IP address&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ssh&lt;/code&gt; into it to check everything&amp;rsquo;s fine. Since this IP was already in my known hosts file, I had to go and delete the old entry first&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ssh-copy-id&lt;/code&gt; to get my ssh keys across&lt;/li&gt;
&lt;li&gt;Update the repositories - by default, Proxmox comes set up for use with a subscription. I wish they had a lower tier and I&amp;rsquo;d buy one since it gives me so much joy - even if it didn&amp;rsquo;t remove the nags. In the meantime, you can follow the instructions &lt;a href="https://pve.proxmox.com/wiki/Package_Repositories#sysadmin_no_subscription_repo"&gt;here&lt;/a&gt; to set it up to use the no-subscription repositories:
&lt;ul&gt;
&lt;li&gt;edit &lt;code&gt;/etc/apt/sources.list&lt;/code&gt; to add &lt;code&gt;deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;edit &lt;code&gt;/etc/apt/sources.list.d/pve-enterprise.list&lt;/code&gt; to comment out the line in there&lt;/li&gt;
&lt;li&gt;and a new one that&amp;rsquo;s not mentioned on that wiki page, edit &lt;code&gt;/etc/apt/sources.list.d/ceph.list&lt;/code&gt; to comment out the line in there. I don&amp;rsquo;t know where that leaves you if you are using Ceph (which is a cool distributed storage system if you&amp;rsquo;re using high availability) but I&amp;rsquo;m not, so all good. If you don&amp;rsquo;t do this, you&amp;rsquo;ll get errors like &lt;code&gt;E: Failed to fetch https://enterprise.proxmox.com/debian/ceph-quincy/dists/bookworm/InRelease 401 Unauthorized [IP: 103.76.41.50 443] E: The repository 'https://enterprise.proxmox.com/debian/ceph-quincy bookworm InRelease' is not signed.&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Run the updates with &lt;code&gt;apt update &amp;amp;&amp;amp; apt upgrade&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Install the certificate - you need SSL setup for the web interface if you want Chrome to let it save your password, which I do. Also the red &lt;em&gt;insecure&lt;/em&gt; message bugs me
&lt;ul&gt;
&lt;li&gt;Log into the web interface at &lt;code&gt;https://&amp;lt;ip address&amp;gt;:8006&lt;/code&gt; - you&amp;rsquo;ll need to jump through all those hoops to take on the responsibility of opening an unsecured site&lt;/li&gt;
&lt;li&gt;Click on the node, then &lt;em&gt;Certificates&lt;/em&gt;, to see the certificate the node is using&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-12.08.29-pm.png" alt=""&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;ul&gt;
&lt;li&gt;You can open up that certificate, and copy out the raw certificate, paste it into a text editor and save it somewhere. I drag that into my macOS keychain app. It shows up with a red cross, but if you open it up you can mark it as &amp;ldquo;always trust&amp;rdquo;&lt;/li&gt;
&lt;li&gt;We&amp;rsquo;re not done yet, now back in Chrome, click on the &lt;em&gt;insecure&lt;/em&gt; message next to the URL. Go into &lt;em&gt;Site Settings&lt;/em&gt; | &lt;em&gt;Insecure Content&lt;/em&gt; and change it to &lt;em&gt;Allow&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Almost there - at the top of those settings is a button to clear the cache, do that&lt;/li&gt;
&lt;li&gt;Reload the page. Profit.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Then I &lt;a href="https://tailscale.com/kb/1031/install-linux/"&gt;install Tailscale&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Last of all, add my NAS to the storage. I use NFS. The only trick here is to go into the dropdown of what type of content is on that storage, and select everything&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-12.17.35-pm.jpg" alt=""&gt;&lt;/p&gt;
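&lt;p&gt;The repository edits in the run book above could be scripted roughly as follows. This is a minimal sketch rather than a tested procedure: it assumes GNU &lt;code&gt;sed&lt;/code&gt; (as shipped with Debian) and that the enterprise and Ceph list files each hold a plain &lt;code&gt;deb&lt;/code&gt; line. &lt;code&gt;PVE_ROOT&lt;/code&gt; and the function names are made up for illustration, so the edits can be rehearsed against a scratch directory first.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch of the repository edits from the run book.
# PVE_ROOT is a hypothetical variable (not from the post): leave it unset
# on a real host, or point it at a scratch directory to rehearse the edits.

# Comment out every active "deb" line in the given file, if it exists.
comment_out_repo() {
    if [ -f "$1" ]; then
        sed -i 's/^deb /# deb /' "$1"
    fi
}

# Append the no-subscription repository unless it is already listed.
# tee echoes the appended line to stdout as a confirmation.
add_no_subscription_repo() {
    repo_line='deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription'
    if ! grep -qxF "$repo_line" "$1"; then
        printf '%s\n' "$repo_line" | tee -a "$1"
    fi
}

# Apply all three edits described in the run book.
switch_to_no_subscription() {
    root="${PVE_ROOT:-}"
    add_no_subscription_repo "$root/etc/apt/sources.list"
    comment_out_repo "$root/etc/apt/sources.list.d/pve-enterprise.list"
    comment_out_repo "$root/etc/apt/sources.list.d/ceph.list"
}
```

&lt;p&gt;On a real host you would call &lt;code&gt;switch_to_no_subscription&lt;/code&gt; as root with &lt;code&gt;PVE_ROOT&lt;/code&gt; unset, then carry on with the update step. Running it a second time should change nothing, since the repository line is only appended when missing and already commented lines no longer match.&lt;/p&gt;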
&lt;p&gt;And that&amp;rsquo;s it. Nice new Proxmox. I&amp;rsquo;ll leave my production VM&amp;rsquo;s on pve-prod2 for a week, and move all of my dev work over to this machine so it gets some exercise before I upgrade the other machines.&lt;/p&gt;
&lt;h3 id="tailscale"&gt;Tailscale&lt;/h3&gt;
&lt;p&gt;The only small issue I ran into (apart from the Ceph repository) was I couldn&amp;rsquo;t access the machine via its &amp;ldquo;magic DNS&amp;rdquo; Tailscale name. Since it was going to be the same name as a machine in my existing network, I&amp;rsquo;d thought ahead and deleted the old one out via the &lt;a href="https://login.tailscale.com/admin/machines"&gt;Tailscale machines&lt;/a&gt; page, but even so, it wouldn&amp;rsquo;t connect from my laptop.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-07-04-at-11.45.38-am.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;I assume the old Tailscale IP address was cached somewhere, and fixed it by turning Tailscale off and on again on my laptop.&lt;/p&gt;</description></item></channel></rss>