<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Grep on blog.iankulin.com</title><link>https://blog.iankulin.com/tags/grep/</link><description>Recent content in Grep on blog.iankulin.com</description><generator>Hugo</generator><language>en-AU</language><lastBuildDate>Wed, 03 May 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.iankulin.com/tags/grep/index.xml" rel="self" type="application/rss+xml"/><item><title>Outside Temperature From an API in a Shell Script</title><link>https://blog.iankulin.com/outside-temperature-from-an-api-in-a-shell-script/</link><pubDate>Wed, 03 May 2023 00:00:00 +0000</pubDate><guid>https://blog.iankulin.com/outside-temperature-from-an-api-in-a-shell-script/</guid><description>&lt;p&gt;I&amp;rsquo;m interested in &lt;a href="https://blog.iankulin.com/linux-shell-script-for-temperature-logging/"&gt;collecting some internal temperature data&lt;/a&gt; from my servers to look at the effect of adding an NVMe drive. Last week we had a couple of warm days immediately followed by a couple of cool ones. I imagine a 20° ambient temperature change could affect the server temperatures, so I thought it would be good to add that to my temperature logs.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t have a weather station or other automated system for collecting the temperature, but there are several commercial sources for this data which, while probably not as good as a sensor in the server room, will be fine for our purposes.&lt;/p&gt;
&lt;p&gt;One of the better known weather APIs was &lt;a href="https://darksky.net/dev"&gt;Dark Sky&lt;/a&gt;. They got bought up by Apple, and similar data is now available in the &lt;a href="https://developer.apple.com/weatherkit/get-started/"&gt;WeatherKit API&lt;/a&gt;. I hold a developer program membership, so that would be free to use for the frequency I need, but the API and sign up looked a bit complex, so I looked elsewhere.&lt;/p&gt;
&lt;p&gt;OpenWeather have a &lt;a href="https://openweathermap.org/current"&gt;simple API&lt;/a&gt; (including one intended to make changing over from Dark Sky easy), &lt;a href="https://openweathermap.org/price"&gt;a good free tier&lt;/a&gt;, and simple sign up - no credit card required. On the free tier I can pull the current weather for a location 22 times a minute continuously. Since I&amp;rsquo;m only collecting my server temps on a five minute cycle, that will be more than fine.&lt;/p&gt;
&lt;p&gt;Even though the API would allow it, it seems wasteful, and greedy (since I&amp;rsquo;m not paying for it), to pull the same data three times (once for each of the three servers), so to complicate things (and learn some interesting stuff) I decided to poll the OpenWeather API once every five minutes from my VPS, process that current weather JSON down to just the temperature I was after, then expose that as an HTTP endpoint. Then each of my servers would poll the VPS to get that outside temp as part of their logging.&lt;/p&gt;
&lt;img src="https://blog.iankulin.com/images/20230425-weather.drawio-1.png" width="435" alt=""&gt;
&lt;p&gt;This will all involve some scripting that I haven&amp;rsquo;t encountered yet.&lt;/p&gt;
&lt;h3 id="vps--weather-api"&gt;VPS / Weather API&lt;/h3&gt;
&lt;p&gt;The OpenWeather API couldn&amp;rsquo;t be more straightforward: you sign up with an email address and get an API token, then it&amp;rsquo;s just this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;https://api.openweathermap.org/data/2.5/weather?lat={lat}&amp;amp;lon={lon}&amp;amp;appid={API key}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There&amp;rsquo;s a couple of options for language and units; I went with &lt;em&gt;metric&lt;/em&gt;, and then you get some JSON.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;coord&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;lon&amp;#34;: 118,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;lat&amp;#34;: -33.93
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;weather&amp;#34;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;id&amp;#34;: 803,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;main&amp;#34;: &amp;#34;Clouds&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;description&amp;#34;: &amp;#34;broken clouds&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;icon&amp;#34;: &amp;#34;04d&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ],
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;base&amp;#34;: &amp;#34;stations&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;main&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;temp&amp;#34;: 12.59,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;feels_like&amp;#34;: 11.68,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;temp_min&amp;#34;: 12.59,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;temp_max&amp;#34;: 12.59,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;pressure&amp;#34;: 1007,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;humidity&amp;#34;: 68,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;sea_level&amp;#34;: 1007,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;grnd_level&amp;#34;: 976
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;visibility&amp;#34;: 10000,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;wind&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;speed&amp;#34;: 7.39,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;deg&amp;#34;: 307,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;gust&amp;#34;: 11.23
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;clouds&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;all&amp;#34;: 64
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;dt&amp;#34;: 1682401802,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;sys&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;country&amp;#34;: &amp;#34;AU&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;sunrise&amp;#34;: 1682375848,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;sunset&amp;#34;: 1682415263
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;timezone&amp;#34;: 28800,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;id&amp;#34;: 2070753,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;name&amp;#34;: &amp;#34;Gnowangerup&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &amp;#34;cod&amp;#34;: 200
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;From this, I want to extract the temperature, and the unix timestamp &amp;ldquo;dt&amp;rdquo;. Here&amp;rsquo;s my bash script.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#5e81ac;font-style:italic"&gt;#!/bin/bash
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;weather_text&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;curl -s &lt;span style="color:#a3be8c"&gt;&amp;#34;https://api.openweathermap.org/data/2.5/weather?lat=-33.93&amp;amp;lon=118.00&amp;amp;appid=somegiantrandomUIDtypenumber&amp;amp;units=metric&amp;#34;&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;temp_text&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;&lt;span style="color:#81a1c1"&gt;echo&lt;/span&gt; $weather_text &lt;span style="color:#eceff4"&gt;|&lt;/span&gt; awk -F&lt;span style="color:#a3be8c"&gt;&amp;#39;&amp;#34;temp&amp;#34;:&amp;#39;&lt;/span&gt; &lt;span style="color:#a3be8c"&gt;&amp;#39;{print $2}&amp;#39;&lt;/span&gt; &lt;span style="color:#eceff4"&gt;|&lt;/span&gt; cut -d&lt;span style="color:#a3be8c"&gt;&amp;#39;,&amp;#39;&lt;/span&gt; -f1&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;time_text&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;&lt;span style="color:#81a1c1"&gt;echo&lt;/span&gt; $weather_text &lt;span style="color:#eceff4"&gt;|&lt;/span&gt; awk -F&lt;span style="color:#a3be8c"&gt;&amp;#39;&amp;#34;dt&amp;#34;:&amp;#39;&lt;/span&gt; &lt;span style="color:#a3be8c"&gt;&amp;#39;{print $2}&amp;#39;&lt;/span&gt; &lt;span style="color:#eceff4"&gt;|&lt;/span&gt; cut -d&lt;span style="color:#a3be8c"&gt;&amp;#39;,&amp;#39;&lt;/span&gt; -f1&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;log_file&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;&amp;#34;/home/ian/iankulin.com/www/gnp_temp.txt&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#81a1c1"&gt;printf&lt;/span&gt; &lt;span style="color:#a3be8c"&gt;&amp;#34;%s,%s&amp;#34;&lt;/span&gt; $temp_text $time_text &amp;gt; $log_file
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Of note, and that I haven&amp;rsquo;t already discussed:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;weather_text&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#bf616a"&gt;`&lt;/span&gt;curl &lt;span style="color:#81a1c1"&gt;-&lt;/span&gt;s &lt;span style="color:#a3be8c"&gt;&amp;#34;https://api.openweathermap.org/data/2.5/weather?lat=-33.93&amp;amp;lon=118.00&amp;amp;appid=somegiantrandomUIDtypenumber&amp;amp;units=metric&amp;#34;&lt;/span&gt;&lt;span style="color:#bf616a"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;curl&lt;/code&gt; basically sends out a network request the same as if you had typed it into the top of your browser. If it was a web page, it would return the text of the HTML, but in this case it returns the JSON I showed before - although less formatted.&lt;/p&gt;
&lt;p&gt;weather_text is a variable to which we are assigning the return value of the curl - ie the string of JSON. Note the backticks `` the curl is enclosed in. This is how the script knows to execute the command and assign the results rather than assigning some text beginning with &lt;code&gt;curl&lt;/code&gt;.&lt;/p&gt;
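&lt;p&gt;As a small aside, here is a minimal sketch of that difference. It is not part of the script above: the variable names are made up for illustration, and &lt;code&gt;$(...)&lt;/code&gt; is the other standard shell spelling of the same command substitution.&lt;/p&gt;

```shell
#!/bin/bash
# Both forms run the command and capture its standard output.
old_style=`echo "captured"`
new_style=$(echo "captured")

# Without substitution the variable just holds the literal text.
literal="echo captured"

echo "$old_style / $new_style / $literal"
```

&lt;p&gt;The &lt;code&gt;$(...)&lt;/code&gt; form is easier to nest, which is the usual reason to prefer it in new scripts.&lt;/p&gt;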
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;temp_text=`echo $weather_text | awk -F&amp;#39;&amp;#34;temp&amp;#34;:&amp;#39; &amp;#39;{print $2}&amp;#39; | cut -d&amp;#39;,&amp;#39; -f1`
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Oh man, this took me on a journey. Firstly, keep in mind I&amp;rsquo;ve prettified the JSON above. The actual string looked like this, all on one line, so it wasn&amp;rsquo;t possible to process it line by line.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&amp;#34;coord&amp;#34;:{&amp;#34;lon&amp;#34;:118,&amp;#34;lat&amp;#34;:-33.93},&amp;#34;weather&amp;#34;:[{&amp;#34;id&amp;#34;:803,&amp;#34;main&amp;#34;:&amp;#34;Clouds&amp;#34;,&amp;#34;description&amp;#34;:&amp;#34;broken clouds&amp;#34;,&amp;#34;icon&amp;#34;:&amp;#34;04d&amp;#34;}],&amp;#34;base&amp;#34;:&amp;#34;stations&amp;#34;,&amp;#34;main&amp;#34;:{&amp;#34;temp&amp;#34;:12.59,&amp;#34;feels_like&amp;#34;:11.68,&amp;#34;temp_min&amp;#34;:12.59,&amp;#34;temp_max&amp;#34;:12.59,&amp;#34;pressure&amp;#34;:1007,&amp;#34;humidity&amp;#34;:68,&amp;#34;sea_level&amp;#34;:1007,&amp;#34;grnd_level&amp;#34;:976},&amp;#34;visibility&amp;#34;:10000,&amp;#34;wind&amp;#34;:{&amp;#34;speed&amp;#34;:7.39,&amp;#34;deg&amp;#34;:307,&amp;#34;gust&amp;#34;:11.23},&amp;#34;clouds&amp;#34;:{&amp;#34;all&amp;#34;:64},&amp;#34;dt&amp;#34;:1682401802,&amp;#34;sys&amp;#34;:{&amp;#34;country&amp;#34;:&amp;#34;AU&amp;#34;,&amp;#34;sunrise&amp;#34;:1682375848,&amp;#34;sunset&amp;#34;:1682415263},&amp;#34;timezone&amp;#34;:28800,&amp;#34;id&amp;#34;:2070753,&amp;#34;name&amp;#34;:&amp;#34;Gnowangerup&amp;#34;,&amp;#34;cod&amp;#34;:200}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We are assigning to the variable &lt;code&gt;temp_text&lt;/code&gt; the output of this command, where &lt;code&gt;$weather_text&lt;/code&gt; is the JSON string.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;echo $weather_text | awk -F&amp;#39;&amp;#34;temp&amp;#34;:&amp;#39; &amp;#39;{print $2}&amp;#39; | cut -d&amp;#39;,&amp;#39; -f1
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The vertical lines &lt;code&gt;|&lt;/code&gt; are called &lt;em&gt;pipes&lt;/em&gt;; they send the output of the command on their left into the command on their right. So there&amp;rsquo;s three different things happening here. The &lt;code&gt;echo&lt;/code&gt; just outputs the JSON, then we process it twice more.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;awk -F&amp;#39;&amp;#34;temp&amp;#34;:&amp;#39; &amp;#39;{print $2}&amp;#39;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;a href="https://www.geeksforgeeks.org/awk-command-unixlinux-examples/"&gt;awk&lt;/a&gt; is one of the great text processing commands along with &lt;code&gt;grep&lt;/code&gt; and &lt;code&gt;sed&lt;/code&gt;. The way it is being used here is to break the string into multiple parts, where the parts are delimited by the text &lt;code&gt;&amp;quot;temp&amp;quot;:&lt;/code&gt; which in our case is just two parts. Then we are outputting the second part ready for the next processing. So at this stage, the text would look like this.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;12.59,&amp;#34;feels_like&amp;#34;:11.68,&amp;#34;temp_min&amp;#34;:12.59,&amp;#34;temp_max&amp;#34;:12.59,&amp;#34;pressure&amp;#34;:1007,&amp;#34;humidity&amp;#34;:68,&amp;#34;sea_level&amp;#34;:1007,&amp;#34;grnd_level&amp;#34;:976},&amp;#34;visibility&amp;#34;:10000,&amp;#34;wind&amp;#34;:{&amp;#34;speed&amp;#34;:7.39,&amp;#34;deg&amp;#34;:307,&amp;#34;gust&amp;#34;:11.23},&amp;#34;clouds&amp;#34;:{&amp;#34;all&amp;#34;:64},&amp;#34;dt&amp;#34;:1682401802,&amp;#34;sys&amp;#34;:{&amp;#34;country&amp;#34;:&amp;#34;AU&amp;#34;,&amp;#34;sunrise&amp;#34;:1682375848,&amp;#34;sunset&amp;#34;:1682415263},&amp;#34;timezone&amp;#34;:28800,&amp;#34;id&amp;#34;:2070753,&amp;#34;name&amp;#34;:&amp;#34;Gnowangerup&amp;#34;,&amp;#34;cod&amp;#34;:200}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then I need to do the same sort of thing again - split the string using a delimiter, and just keep the part with the temperature in it. This time we&amp;rsquo;ll use a comma &lt;code&gt;,&lt;/code&gt; as the delimiter, and only keep the part in front of it.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;cut -d&amp;#39;,&amp;#39; -f1
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We&amp;rsquo;re saying cut this string into bits where the delimiter &lt;code&gt;-d&lt;/code&gt; is a comma, then output the first field &lt;code&gt;-f1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You might be wondering why I didn&amp;rsquo;t just use &lt;code&gt;awk&lt;/code&gt; again - I could have, but &lt;code&gt;cut&lt;/code&gt; is simpler. The reason I didn&amp;rsquo;t use &lt;code&gt;cut&lt;/code&gt; both times is that it can only take a single character as a delimiter. In fact, the first version I wrote of this script only used &lt;code&gt;cut&lt;/code&gt;, and I had the delimiters as colon for the first cut and comma for the second. As I was writing it, I was thinking that I should stress in the blog post that it was quite fragile - a small change in the JSON (for example adding a field, or changing the order - both things that would not cause a problem to a good Swift or JS JSON library) would break it. Then the weather changed so that there were two layers of clouds, and the script broke and output the time as &lt;code&gt;{&amp;quot;all&amp;quot;&lt;/code&gt; instead of a number.&lt;/p&gt;
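&lt;p&gt;To illustrate that fragility, here is a sketch using two cut-down, hypothetical responses rather than real API output. The function names &lt;code&gt;by_position&lt;/code&gt; and &lt;code&gt;by_key&lt;/code&gt; are invented for this example, and the field number 5 only suits the cut-down sample; it is not the number from the original cut-only script.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical, trimmed-down responses: the second has an extra weather entry.
one='{"weather":[{"id":803}],"main":{"temp":12.59,"humidity":68},"dt":1682401802}'
two='{"weather":[{"id":803},{"id":500}],"main":{"temp":12.59,"humidity":68},"dt":1682401802}'

# Positional: take the 5th colon-separated field, then trim at the first comma.
by_position() { echo "$1" | cut -d':' -f5 | cut -d',' -f1; }

# Keyed: split on the literal text "temp": and trim at the first comma.
by_key() { echo "$1" | awk -F'"temp":' '{print $2}' | cut -d',' -f1; }

echo "position, one entry:   $(by_position "$one")"
echo "position, two entries: $(by_position "$two")"
echo "key, one entry:        $(by_key "$one")"
echo "key, two entries:      $(by_key "$two")"
```

&lt;p&gt;With the extra weather entry, every colon after it shifts along by one, so the positional version returns a fragment of JSON, while the keyed version still finds the temperature.&lt;/p&gt;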
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-04-25-at-3.00.13-pm.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;printf&lt;/code&gt; just outputs the two values - temperature and timestamp as plain text with a comma between them into a text file that&amp;rsquo;s in the root of the Nginx webserver.&lt;/p&gt;
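&lt;p&gt;For anything that later needs the two values separately, a minimal sketch of pulling that line apart with shell parameter expansion might look like this. The sample line reuses the values from the JSON shown earlier, and the variable names are assumptions for illustration.&lt;/p&gt;

```shell
#!/bin/bash
# Example contents of the file, using the values from the JSON shown earlier.
line='12.59,1682401802'

temp=${line%%,*}    # everything before the first comma
stamp=${line##*,}   # everything after the last comma

echo "temp=$temp stamp=$stamp"
```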
&lt;p&gt;&lt;a href="https://blog.iankulin.com/images/screen-shot-2023-04-25-at-8.06.34-pm.png"&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-04-25-at-8.06.34-pm.png" width="794" alt=""&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Now that&amp;rsquo;s in place, I just edited &lt;code&gt;/etc/crontab&lt;/code&gt; to have the new script run every five minutes to update the file with the temperature and timestamp.&lt;/p&gt;
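&lt;p&gt;For reference, an entry of that kind could look something like the line below. Unlike a per-user crontab, &lt;code&gt;/etc/crontab&lt;/code&gt; takes a user name between the schedule and the command. The user and the script path here are placeholders, not the real ones.&lt;/p&gt;

```text
# m   h  dom mon dow  user  command
*/5   *  *   *   *    ian   /home/ian/bin/outside_temp.sh
```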
&lt;h3 id="server-temp-logging"&gt;Server Temp Logging&lt;/h3&gt;
&lt;p&gt;We&amp;rsquo;ve already seen most of this, but I&amp;rsquo;ve made a couple of additions.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#5e81ac;font-style:italic"&gt;#!/bin/bash
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#616e87;font-style:italic"&gt;#check drivetemp has been loaded - needed for ssd temp&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#81a1c1;font-weight:bold"&gt;if&lt;/span&gt; ! lsmod &lt;span style="color:#eceff4"&gt;|&lt;/span&gt; grep -wq drivetemp&lt;span style="color:#eceff4"&gt;;&lt;/span&gt; &lt;span style="color:#81a1c1;font-weight:bold"&gt;then&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; modprobe drivetemp
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#81a1c1;font-weight:bold"&gt;fi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#616e87;font-style:italic"&gt;#collect the temp data&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pch_name&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;cat /sys/class/hwmon/hwmon0/name&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;pch_temp&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;cat /sys/class/hwmon/hwmon0/temp1_input&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;cpu_name&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;cat /sys/class/hwmon/hwmon1/name&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;cpu_temp&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;cat /sys/class/hwmon/hwmon1/temp1_input&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;ssd_name&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;cat /sys/class/hwmon/hwmon2/name&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;ssd_temp&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;cat /sys/class/hwmon/hwmon2/temp1_input&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#616e87;font-style:italic"&gt;#this should contain the current outside temp and unix time&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;outside_temp&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;curl -s &lt;span style="color:#a3be8c"&gt;&amp;#34;https://iankulin.com/gnp_temp.txt&amp;#34;&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;`&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;log_file&lt;span style="color:#81a1c1"&gt;=&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;&amp;#34;/var/log/temps.csv&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#616e87;font-style:italic"&gt;# Print the temperatures to a log file&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#81a1c1"&gt;printf&lt;/span&gt; &lt;span style="color:#a3be8c"&gt;&amp;#34;&lt;/span&gt;&lt;span style="color:#81a1c1;font-weight:bold"&gt;$(&lt;/span&gt;date +&lt;span style="color:#a3be8c"&gt;&amp;#39;%d/%m/%Y,%T&amp;#39;&lt;/span&gt;&lt;span style="color:#81a1c1;font-weight:bold"&gt;)&lt;/span&gt;&lt;span style="color:#a3be8c"&gt;,%s,%d,%s,%d,%s,%d,out,%s\n&amp;#34;&lt;/span&gt; $pch_name $pch_temp $cpu_name $cpu_temp $ssd_name $ssd_temp $outside_temp &amp;gt;&amp;gt; $log_file
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We&amp;rsquo;ve already discussed how the curl works - this one is picking up the file written by the script we set up to run on the VPS earlier. More interesting is checking for the &lt;code&gt;drivetemp&lt;/code&gt; module.&lt;/p&gt;
&lt;p&gt;The drivetemp module needs to be loaded into the Linux kernel before we can read the SSD temperature with this line:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;ssd_temp=`cat /sys/class/hwmon/hwmon2/temp1_input`
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Once it&amp;rsquo;s loaded, it stays there, unless the computer is shut down for any reason. There&amp;rsquo;s a &lt;a href="https://www.baeldung.com/linux/run-script-on-startup"&gt;number of places&lt;/a&gt; we can execute things on startup, but really this &lt;code&gt;drivetemp&lt;/code&gt; module is only needed for this script, so we should do it here. As far as I can make out, telling Linux to load a module that&amp;rsquo;s already loaded does not do any harm, and at once every five minutes it&amp;rsquo;s hardly going to cause a performance issue. Nevertheless, some sort of programmer ethics compels me to only do it if it&amp;rsquo;s needed.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#616e87;font-style:italic"&gt;#check drivetemp has been loaded - needed for ssd temp&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#81a1c1;font-weight:bold"&gt;if&lt;/span&gt; &lt;span style="color:#81a1c1"&gt;!&lt;/span&gt; lsmod &lt;span style="color:#81a1c1"&gt;|&lt;/span&gt; grep &lt;span style="color:#81a1c1"&gt;-&lt;/span&gt;wq drivetemp&lt;span style="color:#eceff4"&gt;;&lt;/span&gt; then
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; modprobe drivetemp
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;fi
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;lsmod&lt;/code&gt; returns a list of the loaded modules, this is passed to the &lt;code&gt;grep&lt;/code&gt;. &lt;code&gt;grep&lt;/code&gt; looks through lines of input and usually returns any lines that match. However in this case, we&amp;rsquo;re using the &lt;code&gt;-q&lt;/code&gt; (quiet) option. With this option on, instead of lines of text, you get nothing on the standard output, instead it sets the exit code to 0 (true) if it&amp;rsquo;s found, or 1 (false) if not.&lt;/p&gt;
&lt;p&gt;Since I&amp;rsquo;m interested in only running the &lt;code&gt;modprobe&lt;/code&gt; if &lt;code&gt;drivetemp&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; found, I have to negate the result of the &lt;code&gt;grep&lt;/code&gt; with &lt;code&gt;!&lt;/code&gt;&lt;/p&gt;
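&lt;p&gt;Here is a small sketch of that exit code logic which runs without root. The &lt;code&gt;fake_lsmod&lt;/code&gt; function and its module list are invented stand-ins for real &lt;code&gt;lsmod&lt;/code&gt; output, and &lt;code&gt;drivetemp_extra&lt;/code&gt; is a made-up module name included to show what the &lt;code&gt;-w&lt;/code&gt; (whole word) option is doing.&lt;/p&gt;

```shell
#!/bin/bash
# Stand-in for lsmod output (hypothetical module list).
fake_lsmod() {
    printf 'Module                  Size  Used by\n'
    printf 'drivetemp_extra        16384  0\n'
    printf 'coretemp               20480  0\n'
}

# -q: print nothing, report the result through the exit code only.
# -w: match whole words, so "drivetemp_extra" does not count as "drivetemp".
# The leading ! flips the exit code of the whole pipeline.
if ! fake_lsmod | grep -wq drivetemp; then
    status="missing"
else
    status="loaded"
fi

echo "drivetemp is $status"
```

&lt;p&gt;Without &lt;code&gt;-w&lt;/code&gt;, the similarly named module would match and the real one would never get loaded.&lt;/p&gt;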
&lt;p&gt;After that, all the temperature data is collected, then written out to a log file for later processing.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-04-25-at-9.00.47-pm.jpg" alt=""&gt;&lt;/p&gt;
&lt;h3 id="the-results"&gt;The Results&lt;/h3&gt;
&lt;p&gt;Here&amp;rsquo;s 24 hours of the five minute temperature logs. For each server I averaged the three different temperatures (PCH, CPU core, and SSD drive) and graphed them along with the outside temperature from OpenWeather. &lt;code&gt;pve-prod1&lt;/code&gt; is the only one doing any real work here. It hosts my Jellyfin media server on a VM, and another VM with a collection of utilities such as Uptime Kuma. The Y axis is degrees centigrade.&lt;/p&gt;
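&lt;p&gt;The post doesn&amp;rsquo;t say which tool did the averaging, but as a rough sketch it could be done on the command line with &lt;code&gt;awk&lt;/code&gt;. The sample row and its sensor names are hypothetical; the column order follows the &lt;code&gt;printf&lt;/code&gt; in the logging script, and the hwmon &lt;code&gt;temp1_input&lt;/code&gt; files report millidegrees Celsius, hence the division by 3000 rather than 3.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical log row in the column order the logging script writes.
row='25/04/2023,21:00:01,pch_cannonlake,45000,coretemp,52000,drivetemp,38000,out,12.59,1682401802'

# Fields 4, 6 and 8 are millidegrees, so dividing their sum by 3000
# gives the mean of the three readings in degrees. Field 10 is the outside temp.
summary=$(printf '%s\n' "$row" | awk -F',' '{ printf "%s %s %.1f %s", $1, $2, ($4 + $6 + $8) / 3000, $10 }')

echo "$summary"
```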
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/20230427-server-temps.jpg" alt=""&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/img_4315.jpg" alt=""&gt;&lt;/p&gt;
&lt;p&gt;The spike in &lt;code&gt;pve-dev1&lt;/code&gt; at 2100 was caused by me stress testing one core to 100% load for ten minutes. I think I can see &lt;code&gt;pve-prod2&lt;/code&gt; (which sits directly on top of &lt;code&gt;pve-dev1&lt;/code&gt;) warming up a little as well. But strangely, and perhaps I&amp;rsquo;m imagining it, it seems like &lt;code&gt;pve-prod1&lt;/code&gt; (which sits on top of the stack) was a bit cooler in that time?&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t remember if I watched some TV between 6 and 8pm, but it looks like I did, and the spike at 2am will be the nightly snapshots being taken and sent off to the NAS.&lt;/p&gt;
&lt;p&gt;You can see that &lt;code&gt;pve-prod2&lt;/code&gt; and &lt;code&gt;pve-dev1&lt;/code&gt; were turned on to run this test, and it takes about 40 minutes for them to warm up. It&amp;rsquo;s interesting to notice the bigger amplitude of the production machine compared to the others just idling. And also interesting that &lt;code&gt;pve-dev1&lt;/code&gt; (which wasn&amp;rsquo;t running any load till I ran the stress test on it) was just generally warmer than &lt;code&gt;pve-prod1&lt;/code&gt; which was running a small work load.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s have a look at pve-dev1 while the stress test was running.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/pve-dev1-temp.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;It makes sense that the PCH, which is millimetres away from, and directly connected to, the CPU, would warm up as the CPU was hammered with square root calculations. The drive temp is up a little too, so I guess that reflects the ambient temperature inside the case.&lt;/p&gt;
&lt;p&gt;The CPU temperature hadn&amp;rsquo;t plateaued yet, so it might be interesting to run it until it does one day and see what that looks like.&lt;/p&gt;</description></item><item><title>Recursive list of files in Linux</title><link>https://blog.iankulin.com/recursive-list-of-files-in-linux/</link><pubDate>Wed, 08 Mar 2023 00:00:00 +0000</pubDate><guid>https://blog.iankulin.com/recursive-list-of-files-in-linux/</guid><description>&lt;p&gt;I&amp;rsquo;ve spent a few hours over the weekend migrating a media library from an external USB drive to the NAS, and in the process reorganised it, and in many cases bulk changed file names. I&amp;rsquo;ve also added a heap of metadata.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;d like to check that I haven&amp;rsquo;t missed any files, but a side by side listing of each data source won&amp;rsquo;t do the trick, so I&amp;rsquo;ll probably end up pulling the data into a spreadsheet, but I&amp;rsquo;d like to get as close as possible with Linux-fu first.&lt;/p&gt;
&lt;p&gt;Before I go over my trial and error, and eventual solution, here&amp;rsquo;s how I&amp;rsquo;ve set up my test data for the examples. I thought I&amp;rsquo;d better start with something simple and small for testing commands.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-5.02.17-pm.png"&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-5.02.17-pm.png" width="495" alt=""&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This is actually the output of the &lt;code&gt;tree&lt;/code&gt; command on a &lt;code&gt;test&lt;/code&gt; directory I&amp;rsquo;ve created in my home directory. (I had to install it - &lt;code&gt;sudo apt install tree&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;What I need to end up with is something that recursively lists all the files, with one file per line, and it needs to include the directory tree to reach it. I should be able to pipe it through something to ignore lines that are just directories (and any other fluff).&lt;/p&gt;
&lt;h3 id="ls"&gt;ls&lt;/h3&gt;
&lt;p&gt;My go-to for listing files is &lt;code&gt;ls -all&lt;/code&gt;, so perhaps that can help us? It lists one line per file (along with permissions etc.), so if we add &lt;code&gt;-R&lt;/code&gt; for recursive, that could be it. Here&amp;rsquo;s the output for &lt;code&gt;ls -all -R test&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;total 16
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 4 ian ian 4096 Mar 6 16:36 .
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 4 ian ian 4096 Mar 6 16:36 ..
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 2 ian ian 4096 Mar 6 17:01 dir1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 2 ian ian 4096 Mar 6 17:01 dir2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;total 8
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 2 ian ian 4096 Mar 6 17:01 .
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 4 ian ian 4096 Mar 6 16:36 ..
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:01 ignore.me
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:00 media1.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:00 media1.ex2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:01 media3.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 16:36 somefile
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 16:36 somefile2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;total 8
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 2 ian ian 4096 Mar 6 17:01 .
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;drwxr-xr-x 4 ian ian 4096 Mar 6 16:36 ..
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:01 ignore.me
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:01 media4.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:01 media5.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 17:01 media6.ex2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 16:37 somefile
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;-rw-r--r-- 1 ian ian 0 Mar 6 16:37 somefile3
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;So we get one line per file, but the directory name is on its own line at the beginning of each directory listing, rather than on each file&amp;rsquo;s line.&lt;/p&gt;
&lt;h3 id="find"&gt;find&lt;/h3&gt;
&lt;p&gt;Based on &lt;a href="https://www.cyberciti.biz/faq/how-to-show-recursive-directory-listing-on-linux-or-unix/"&gt;this post&lt;/a&gt;, there is a command, &lt;code&gt;find&lt;/code&gt;, that might do what we want. The simple version would be &lt;code&gt;find test&lt;/code&gt; (remember test is the directory name).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;ian@vm102-jellyfin:~$ find test
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2/ignore.me
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2/media4.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2/somefile
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2/media5.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2/somefile3
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir2/media6.ex2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1/media3.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1/ignore.me
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1/somefile
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1/media1.ex1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1/media1.ex2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;test/dir1/somefile2
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Well, that is real close, but there&amp;rsquo;s no way to tell a file from a directory. In that same post, it&amp;rsquo;s suggested to use the &lt;code&gt;-ls&lt;/code&gt; option to see some more detail. Let&amp;rsquo;s try &lt;code&gt;find test -ls&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-5.20.59-pm.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;This looks pretty close. If there was some way of using that &amp;rsquo;d&amp;rsquo; in the first position of the permissions output to eliminate those lines, we&amp;rsquo;d be well on our way. I have a feeling this is a &lt;code&gt;grep&lt;/code&gt; question. I have some basic grep, so for example I know I could pull all of those directories with &lt;code&gt;find test -ls | grep ' d'&lt;/code&gt;, or even invert it with the &lt;code&gt;-v&lt;/code&gt; flag to get just the files (which is our eventual goal).&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-5.36.37-pm.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;However, this is pretty hacky. A space followed by a lowercase d could easily occur in a filename. What I really need to do is look at just that column, which I think is character number 18. Off to &lt;a href="https://unix.stackexchange.com/questions/32170/find-all-lines-in-a-file-with-a-certain-character-at-a-certain-position"&gt;Stack Exchange&lt;/a&gt; I guess&amp;hellip;&lt;/p&gt;
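&lt;p&gt;To make the problem concrete, here&amp;rsquo;s a minimal sketch. The two lines are made-up sample data in roughly the shape of &lt;code&gt;find -ls&lt;/code&gt; output, and the file name &lt;code&gt;my dog.ex1&lt;/code&gt; is hypothetical; it just needs to contain a space followed by a d.&lt;/p&gt;

```shell
# Two made-up lines in roughly the shape of `find -ls` output (hypothetical data).
# The first is a directory, the second a regular file whose name contains " d".
sample=$(printf '%s\n' \
  '   131079      4 drwxr-xr-x   2 ian ian 4096 Mar  6 17:01 test/dir1' \
  '   131085      0 -rw-r--r--   1 ian ian    0 Mar  6 17:01 test/dir1/my dog.ex1')

# The naive filter drops every line containing " d" anywhere,
# so the file is thrown out along with the directory.
naive=$(printf '%s\n' "$sample" | grep -v ' d')

printf 'kept: [%s]\n' "$naive"
```

&lt;p&gt;Nothing survives the filter here, which is exactly the sort of silent data loss I don&amp;rsquo;t want when checking for missing files.&lt;/p&gt;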
&lt;h3 id="grep-with-regex"&gt;grep with regex&lt;/h3&gt;
&lt;p&gt;Okay, it turns out we can use regex with grep. I&amp;rsquo;m no expert in that either, but in regex the caret ^ represents the start of the line, a full stop represents any character, and we can repeat that however many times we want by following it with a number in (escaped) curly braces. Something like &lt;code&gt;'^.\{17\}d'&lt;/code&gt; should do it.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-5.44.46-pm.png" alt=""&gt;&lt;/p&gt;
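&lt;p&gt;Here&amp;rsquo;s a small sketch of that positional match on made-up data. It assumes the layout described above, where the permissions start at character 18; the inode and block values are invented, and the column widths of real &lt;code&gt;find -ls&lt;/code&gt; output may differ on other systems. Plain &lt;code&gt;grep&lt;/code&gt; uses basic regular expressions, where the braces are written &lt;code&gt;\{17\}&lt;/code&gt;; with &lt;code&gt;grep -E&lt;/code&gt; the unescaped &lt;code&gt;{17}&lt;/code&gt; form works instead.&lt;/p&gt;

```shell
# Build two sample lines shaped like `find -ls` output (made-up values).
# %9s and %6s pad the first two columns so the permissions start at column 18.
dir_line=$(printf '%9s %6s %s' 131079 4 'drwxr-xr-x 2 ian ian 4096 Mar 6 17:01 test/dir1')
file_line=$(printf '%9s %6s %s' 131085 0 '-rw-r--r-- 1 ian ian 0 Mar 6 17:01 test/dir1/my dog.ex1')

# ^ anchors to the start of the line, .\{17\} skips exactly 17 characters,
# so d has to be the 18th character. -v inverts the match to keep non-directories.
files_only=$(printf '%s\n%s\n' "$dir_line" "$file_line" | grep -v '^.\{17\}d')

printf '%s\n' "$files_only"
```

&lt;p&gt;Only the file line is printed, even though its name contains a space followed by a d.&lt;/p&gt;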
&lt;p&gt;Okay! We&amp;rsquo;re getting close. I also want to ignore all the metadata and just see the media files. This can be determined by the extensions - probably .avi .mp4 .mkv .m4v. With this test data, we&amp;rsquo;ll pretend it&amp;rsquo;s .ex1 and .ex2.&lt;/p&gt;
&lt;h3 id="combining-grep-tests-with-logical-or"&gt;Combining grep tests with logical or&lt;/h3&gt;
&lt;p&gt;I guess I could build some sort of super regex combined with the first one, but I&amp;rsquo;m only dealing with thousands of files, not millions, so the extra overhead of piping through another grep is not going to be a drama, and I can simplify my work. In the same way that the caret ^ marks the start of a line, the dollar $ marks the end of it. So to just get the .ex1 files, something like &lt;code&gt;'\.ex1$'&lt;/code&gt; should do it. The backslash at the start is to escape the period, because here we want that to mean a literal full stop, and not a wildcard for any character.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-5.55.25-pm.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;Nice, but remember I&amp;rsquo;ve got a big list of extensions, so I need to logically OR a few together. In regex this is done by putting a pipe between the expressions. I had a couple of goes at this with no luck and that familiar feeling of being out of my depth with regex. However, there&amp;rsquo;s a grep way out of this, because repeating the grep flag &lt;code&gt;-e&lt;/code&gt; allows us to OR matching expressions.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-6.06.52-pm.png" alt=""&gt;&lt;/p&gt;
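&lt;p&gt;As a rough sketch of the two ways to OR patterns, using a made-up list of paths: repeated &lt;code&gt;-e&lt;/code&gt; flags, and a single extended regex with a pipe inside a group. The second form relies on &lt;code&gt;grep -E&lt;/code&gt;, since in plain grep an unescaped pipe is just a literal character.&lt;/p&gt;

```shell
# Made-up list of paths, one per line.
paths=$(printf '%s\n' \
  'test/dir1/media1.ex1' \
  'test/dir1/media1.ex2' \
  'test/dir1/somefile' \
  'test/dir2/ignore.me')

# Several -e patterns: a line is kept if it matches any one of them.
with_e=$(printf '%s\n' "$paths" | grep -e '\.ex1$' -e '\.ex2$')

# The same OR written as one extended regex (-E), with | inside a group.
with_alternation=$(printf '%s\n' "$paths" | grep -E '\.(ex1|ex2)$')

printf '%s\n' "$with_e"
```

&lt;p&gt;Both variables end up holding the same two media paths. The &lt;code&gt;-e&lt;/code&gt; form is easier to extend one extension at a time, which suits a long list.&lt;/p&gt;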
&lt;p&gt;We&amp;rsquo;re definitely getting somewhere. You might think at this point that the chance of a directory name ending in .mp4 or one of the other media extensions is so low we could ignore it, and you&amp;rsquo;d probably be right. But as a matter of programmer pride, I never like to leave a future problem, so I&amp;rsquo;ll be keeping the directory-rejecting grep. So now my command looks like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#d8dee9;background-color:#2e3440;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;find test -ls | grep -v &amp;#39;^.\{17\}d&amp;#39; | grep -e &amp;#39;\.ex1$&amp;#39; -e &amp;#39;\.ex2$&amp;#39;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Any experienced regex people would be pointing out that the match for .ex1 and .ex2 can easily be merged into a single expression, but remember when I do this for real I&amp;rsquo;ve got a list of more complex extensions to test for.&lt;/p&gt;
&lt;h3 id="cut"&gt;cut&lt;/h3&gt;
&lt;p&gt;All that text at the beginning of these lines is not needed. Surely I can trim that off somehow? Yep - there&amp;rsquo;s a command &lt;code&gt;cut&lt;/code&gt; that does exactly that. The &lt;code&gt;-b&lt;/code&gt; flag specifies which byte to extract, and this can also be a range. Putting a dash after the position number says to output that byte and all of the bytes after it. So if we applied &lt;code&gt;cut -b 5-&lt;/code&gt; to the string &lt;code&gt;123456789&lt;/code&gt;, the output would be &lt;code&gt;56789&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-6.26.18-pm.png" alt=""&gt;&lt;/p&gt;
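&lt;p&gt;The &lt;code&gt;cut&lt;/code&gt; example above can be tried directly. This is just a sketch on a literal string; the closed range in the second command is an extra illustration, not something the real pipeline needs.&lt;/p&gt;

```shell
# Keep everything from byte 5 onwards.
tail_part=$(printf '123456789\n' | cut -b 5-)
printf '%s\n' "$tail_part"    # 56789

# A closed range keeps only bytes 2 to 4.
middle_part=$(printf '123456789\n' | cut -b 2-4)
printf '%s\n' "$middle_part"  # 234
```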
&lt;p&gt;Bingo. Just one more problem. In my data, I have a heap of files with a valid extension that I want to exclude based on their file name. Every directory with a movie has a trailer named &lt;code&gt;trailer.mp4&lt;/code&gt;, so I need to eliminate them. To simulate this, let&amp;rsquo;s add in another extension with our test data.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-6.29.48-pm.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;So I want to take out the lines that end in &amp;lsquo;/ignore.me&amp;rsquo;. I should be able to do this with another &lt;code&gt;grep -v&lt;/code&gt; regex on a line end. Something like &lt;code&gt;grep -v '/ignore\.me$'&lt;/code&gt;, with the period escaped as before so it only matches a literal full stop.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.iankulin.com/images/screen-shot-2023-03-06-at-6.33.22-pm.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;And we&amp;rsquo;re done! I&amp;rsquo;ll just direct this into a file, run it on both disks and pull them into Excel to separate the file names and directories, and sort them to compare.&lt;/p&gt;</description></item></channel></rss>