<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[tegud.net]]></title><description><![CDATA[Steve Elliott, Manchester based .NET, JavaScript developer, blogging about programming, DevOps, automation and other techy things.]]></description><link>https://www.tegud.net/</link><image><url>https://www.tegud.net/favicon.png</url><title>tegud.net</title><link>https://www.tegud.net/</link></image><generator>Ghost 2.9</generator><lastBuildDate>Mon, 14 Sep 2020 05:27:43 GMT</lastBuildDate><atom:link href="https://www.tegud.net/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Triggering Github Actions manually with the API]]></title><description><![CDATA[You can now trigger Github Workflows manually [https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/] , hooray! Rather bafflingly the announcement doesn't mention the API, but there is one! Hunting through the also recently announced OpenAPI Github.com API Specifications [https://github.com/github/rest-api-description/blob/main/descriptions/api.github.com/api.github.com.yaml] yields the POST - /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispat]]></description><link>https://ghost.tegud.net/triggering-github-actions-manually-with-the-api/</link><guid isPermaLink="false">Ghost__Post__5f226eb825ccbf0001dbce38</guid><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Sun, 13 Sep 2020 19:44:55 GMT</pubDate><content:encoded><![CDATA[<p>You can now trigger <a href="https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/">Github Workflows manually</a>, hooray! Rather bafflingly, the announcement doesn't mention the API, but there is one! Hunting through the (also recently announced) <a href="https://github.com/github/rest-api-description/blob/main/descriptions/api.github.com/api.github.com.yaml">OpenAPI Github.com API Specifications</a> yields the <code>POST - /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches</code> endpoint.</p><p>Before you can use the endpoint you'll need a couple of things:</p><ul><li><strong>A Github API Token</strong> with the <code>repo</code> scope, which this endpoint requires for authentication</li><li><strong>The workflow_id</strong> - this is the numeric id; you can get it from the <code>GET - /repos/{owner}/{repo}/actions/workflows</code> endpoint (see the example at the end of this post)</li></ul><h2 id="enable-manual-workflow-trigger">Enable manual workflow trigger</h2><p>First we need to tell our workflow to trigger on the manual event:</p><pre><code>on:
  workflow_dispatch:</code></pre><p>This enables the manual trigger button in the UI, as well as triggering via the API.</p><h2 id="call-the-workflow-dispatch-endpoint">Call the workflow dispatch endpoint</h2><pre><code>curl --location --request POST 'https://api.github.com/repos/tegud/example/actions/workflows/123456/dispatches' \
  --header 'Authorization: Bearer <MY TOKEN>' \
  --header 'Content-Type: application/json' \
  --data-raw '{ "ref": "master" }'</code></pre>
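<p>To look up the workflow_id, it's a single call to the list endpoint mentioned above - for example (the <code>jq</code> filter here is just my addition to trim the response down to the useful fields):</p><pre><code>curl --location 'https://api.github.com/repos/tegud/example/actions/workflows' \
  --header 'Authorization: Bearer <MY TOKEN>' \
  | jq '.workflows[] | { id, name, path }'</code></pre>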
<p>That's pretty much it. Nice and easy.</p><p>You can provide optional inputs as well, which introduces the possibility of workflows calling workflows - in fact someone's already created a Github Action for that: <a href="https://github.com/marketplace/actions/workflow-dispatch">https://github.com/marketplace/actions/workflow-dispatch</a> </p>]]></content:encoded></item><item><title><![CDATA[Using Github Actions Securely]]></title><description><![CDATA[Github have recently released Actions, much like CircleCI and Travis, it makes it really easy to get up and running to Build, Test and Deploy your application. Unlike CircleCI and Travis, Actions allow you to easily reuse Actions via the Marketplace. This opens up some really nice ways of reusing common functionality, but much like Docker images, chef recipes, npm packages or similar, you have to be conscious of the source of the component, and whether it's safe to bring in to your application]]></description><link>https://ghost.tegud.net/using-github-actions-securely/</link><guid isPermaLink="false">Ghost__Post__5d83f62036206e0001f7f1c1</guid><category><![CDATA[security]]></category><category><![CDATA[continuous integration]]></category><category><![CDATA[github-actions]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Sat, 08 Feb 2020 09:27:13 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2020/02/225007582_cc34295708_h.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2020/02/225007582_cc34295708_h.jpg" alt="Using Github Actions Securely"/><p>Github have recently released Actions; much like CircleCI and Travis, it makes it really easy to get up and running to Build, Test and Deploy your application. Unlike CircleCI and Travis, Actions allow you to easily reuse Actions via the Marketplace. This opens up some really nice ways of reusing common functionality, but much like Docker images, chef recipes, npm packages or similar, you have to be conscious of the source of the component, and whether it's safe to bring into your application/environment.</p><p>CI/CD tools are trusted with some of the most critical secrets: AWS keys, source control access. Not to mention that they can be used to inject compromised code into built artefacts. Security really needs to be considered when re-using Actions. Always check:</p><!--kg-card-begin: html--><ul> <li>The source (<a href="https://github.com/marketplace?type=actions&verification=verified" target="_blank">Actions don't seem to support verification yet</a>)</li> <li>You can link the Action to a version of the code</li> <li>Any Docker containers used are also safe</li> <li>You can see the code and ensure it does what it should do</li> </ul><!--kg-card-end: html--><p>And you should check this on every version upgrade - because you really don't want to risk bringing dangerous code into your Workflow.</p><p>Still, despite all these checks, the way Github Releases (and the Marketplace) work, you could still have an issue.</p><h2 id="third-party-actions-versioning-vulnerable">Third Party Actions - Versioning Vulnerable</h2><p>One of the major differences of Github Actions is the support for reusable actions. These package up common tasks, so you can include and reference them easily. They're versioned and available for everyone to find in Github's Marketplace. They can do anything from building code, linting, testing, to deployment. 
</p><p>But the recommended way of including actions is currently a massive security vulnerability in your Action Workflow.</p><p>Take my action: <a href="https://github.com/marketplace/actions/serverlesscli">https://github.com/marketplace/actions/serverlesscli</a></p><pre><code>- name: serverless deploy
  uses: tegud/serverless-github-action@1.52.0
  with:
    command: deploy
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}</code></pre><p>When included in your Workflow, with the correct secrets set, it will deploy your Serverless functions using the specified AWS credentials (don't commit your AWS keys, folks).</p><p>But there's a catch: I've specified I want to use version 1.52.0, and I would expect this release to be immutable. If you want to see what happens when lots of people depend on an external dependency and it then disappears, see the great NodeJS left-pad debacle. But unfortunately, even worse than that is allowing someone to replace the previous release with new code <em>with the same version</em>.</p><p>Which is possible with Github Actions.</p><ol><li>Create an action, release it to the market place (e.g. an-org/cool-action@v1)</li><li>Create a workflow which utilises an-org/cool-action@v1 (it may or may not receive/require sensitive secrets such as AWS keys)</li><li>Commit new code to an-org/cool-action, which could: transmit secrets to external servers, modify code to inject vulnerabilities, track usage in ways not agreed to, ship code to external sources.</li><li>Delete release & tag v1 on an-org/cool-action</li><li>Create a release using the vulnerability-injected code above with the same name/tag "v1"</li><li>Trigger the workflow; the attacking code within the action will be executed.</li></ol><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2020/02/secret-api-key-logs.png" class="kg-image" alt="Using Github Actions Securely"/></figure><p>7. And now I have your secret password.</p><p>The scope of attacks here is somewhat concerning. With the Serverless CLI example above, we've got someone's AWS credentials, and probably with a decent level of access to boot. But it could be even worse: the workflow could easily inject compromised code into an application <em>before</em> deployment, and enable worse privacy/data breaches, or crypto mining. The CI/CD pipeline is a key part of application build and deployment, making it potentially devastating to exploit, and here Github Actions are setting a best practice that encourages users to open themselves up to this sort of attack.</p><p>Github's response is to version using the SHA instead of the version number, so from our example above again:</p><pre><code>- name: serverless deploy
  uses: tegud/serverless-github-action@089a3603204affdd86da79e742fa520a6fe16d69
  with:
    command: deploy
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}</code></pre>
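<p>To find the SHA to pin to, you need to resolve the tag to its commit - <code>git ls-remote</code> will do this without cloning. A quick sketch, using the example action above (note: for an annotated tag you'd want the peeled <code>^{}</code> entry):</p><pre><code>git ls-remote https://github.com/tegud/serverless-github-action 1.52.0
# 089a3603204affdd86da79e742fa520a6fe16d69  refs/tags/1.52.0</code></pre>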
<p>This would be much harder to exploit. But that's not in their documentation, and it's not in the examples you'll see in most of the Marketplace either (to be fair, it'd be remarkably hard to put the SHA in the Marketplace description, as you don't know it in advance).</p><p>It may be you can trust a Third Party Action vendor in the marketplace if they're "official" (though it's early days, so there's no verification of official or not), but since a lot of these are open source projects with multiple contributors, who take on additional contributors from the community, again: is it worth the risk? </p><h2 id="so-don-t-use-actions-">So don't use Actions?!</h2><p>Github Actions are good - I'm using them, and I'm enjoying them overall. If you treat your CI/CD pipeline with the distrust it warrants (seriously, it's so important you know where the things you put in your pipeline come from!) then it's a serious alternative to CircleCI or Travis. </p><p>Hopefully Github will tighten up security around releases, and I can update this blog post in the future to say it's been fixed. But for now, be careful what actions you let into your workflows.</p>]]></content:encoded></item><item><title><![CDATA[Elasticsearch Manchester Meetup talk - 30 Jan 2020 - Entity Centric Indexing with Elasticsearch Transforms]]></title><description><![CDATA[Thanks to everyone who listened to my talk. I'll be turning the content in to a blog post (or two), but in the meantime the content is linked below. If you'd like to talk about Entity Centric Indexes or other things Elasticsearch you can find me on twitter [https://www.twitter.com/tegud]. Slides https://docs.google.com/presentation/d/1N0k8CARPkfNR2kJ7WDXHClHpyFAfdvmFuo_QQJELK6g/edit?usp=sharing Elasticsearch Transforms Demo Application Demo Application I built to manage the transforms and ]]></description><link>https://ghost.tegud.net/entity-centric-indexing-with-elasticsearch-transforms/</link><guid isPermaLink="false">Ghost__Post__5e34a593bfc3810001ca5a81</guid><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Fri, 31 Jan 2020 22:24:30 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2020/01/0.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2020/01/0.jpg" alt="Elasticsearch Manchester Meetup talk - 30 Jan 2020 - Entity Centric Indexing with Elasticsearch Transforms"/><p>Thanks to everyone who listened to my talk. I'll be turning the content into a blog post (or two), but in the meantime the content is linked below. 
If you'd like to talk about Entity Centric Indexes or other things Elasticsearch, you can <a href="https://www.twitter.com/tegud">find me on twitter</a>.</p><p><strong>Slides</strong></p><p><a href="https://docs.google.com/presentation/d/1N0k8CARPkfNR2kJ7WDXHClHpyFAfdvmFuo_QQJELK6g/edit?usp=sharing">https://docs.google.com/presentation/d/1N0k8CARPkfNR2kJ7WDXHClHpyFAfdvmFuo_QQJELK6g/edit?usp=sharing</a></p><p><strong>Elasticsearch Transforms Demo Application</strong></p><p>The demo application I built to manage the transforms and generate demonstration data for the scenarios I referenced.</p><p><a href="https://github.com/tegud/dataframe-demos">https://github.com/tegud/dataframe-demos</a></p><p><strong>Compose (The JavaScripty one)</strong></p><p><a href="https://github.com/tegud/SENTINEL.Composer.Builder">https://github.com/tegud/SENTINEL.Composer.Builder</a><br><a href="https://github.com/tegud/SENTINEL.Composer.Store">https://github.com/tegud/SENTINEL.Composer.Store</a></br></p><p><em>Disclaimer: The code is 3-5 years old, and I honestly haven't touched it in 3 years. Hopefully it'll give an idea of how it can be achieved.</em></p>]]></content:encoded></item><item><title><![CDATA[Checkless & Logz.io - Free & Easy uptime history]]></title><description><![CDATA[Since the very first version of Checkless, I wanted to keep it simple and cheap (preferably free). So to begin with, the only two notifiers were Slack and Email, this meant there wasn't much in the way of reports or check history. This was fine, as it still told me when things were wrong within a reasonable time frame. But it would be nice to: * Have check result history * Have some nice visualisations * Have more options for notifications Whilst maintaining the key design goals (Simple ]]></description><link>https://ghost.tegud.net/checkless-getting-free-easy-check-history/</link><guid isPermaLink="false">Ghost__Post__5c8ea50a8205f0000108bff4</guid><category><![CDATA[checkless]]></category><category><![CDATA[serverless]]></category><category><![CDATA[elasticsearch]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Tue, 19 Mar 2019 08:30:00 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-17-at-20.11.56.png" medium="image"/><content:encoded><![CDATA[<img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-17-at-20.11.56.png" alt="Checkless & Logz.io - Free & Easy uptime history"/><p>Since the very first version of Checkless, I wanted to keep it simple and cheap (preferably free). So to begin with, the only two notifiers were Slack and Email, which meant there wasn't much in the way of reports or check history. This was fine, as it still told me when things were wrong within a reasonable time frame. But it would be nice to:</p><ul><li>Have check result history</li><li>Have some nice visualisations</li><li>Have more options for notifications</li></ul><p>Whilst maintaining the key design goals (Simple & Free/Cheap). </p><p>Given I was already using AWS Lambda, my mind jumped first to DynamoDB as a storage mechanism - this is cheap, but it's not terribly simple (at least not to me, yet), and this particular use case isn't well documented (multiple keys/dimensions, almost time series data). 
Being so focused on AWS based technologies, I upsettingly neglected a technology I was very familiar with: The Elastic Stack.</p><h2 id="cheap-elasticsearch">Cheap, Elasticsearch?</h2><p>Elasticsearch isn't generally associated with being cheap; its memory requirements are quite demanding, for a start. Elastic Cloud has helped a bit - you can get a single node for ~£15/mo - but this all started because I didn't want to pay Pingdom a tenner a month, so given I didn't already have an Elasticsearch instance, that didn't seem like a particularly appealing option.</p><p>However, in one of my recent roles I encountered <a href="https://www.logz.io">Logz.io</a>, a hosted Elastic Stack platform with a community option that's free! You don't get a huge amount of retention - 3 days - but it fulfils the primary design goals (Cheap & Simple), and it's Elasticsearch/Kibana, which meant I knew it well.</p><h2 id="getting-the-data-in">Getting the data in</h2><p>Logz.io (like Logstash) supports sending data in via an HTTP(S) endpoint. This is pretty much exactly how the slack notifier works, except it's not even necessary to format the data; we just send the check result as JSON in the request body.</p><p>I created a new version of Checkless (v2.1.0) and Checkless-CLI (v1.11.1) that can send check results to a configured webhook. This allows the data to be shipped to logz.io (and potentially a wide range of other targets as well). Using checkless-cli, it's just a matter of adding the notifier to the <code>checkless.yml</code> file:</p><pre><code>notifications:
  - webhook:
      webhookUrl: '${env:CHECKLESS_LOGZIO_WEBHOOK_PATH}'</code></pre><p>And the correct serverless config will be generated to always send the result to your logz.io endpoint. </p>
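<p>What actually gets shipped is just the check result serialised as JSON. The exact shape depends on your Checkless version - the fields below are purely illustrative, not a schema guarantee:</p><pre><code>{
  "url": "https://www.tegud.net",
  "region": "eu-west-1",
  "success": true,
  "statusCode": 200,
  "timeToFirstByte": 320
}</code></pre>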
<p><em>env:CHECKLESS_LOGZIO_WEBHOOK_PATH</em> is a CircleCI environment variable containing the logz.io log shipper HTTPS endpoint. Once you've signed up to Logz.io you can go to the Log Shipping tab -> Libraries -> Bulk HTTP/S to get the version for your account.</p><p>As soon as this is deployed you should start to see events in your Discover tab:</p><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-18-at-23.22.16-1.png" class="kg-image" alt="Checkless & Logz.io - Free & Easy uptime history"/></figure><h2 id="visualising">Visualising</h2><p>With logz.io you get Kibana, which gives you huge flexibility to create visualisations and dashboards. Once the data's in, it's really easy to create overview dashboards of all checks, or of specific websites:</p><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-18-at-21.55.59.png" class="kg-image" alt="Checkless & Logz.io - Free & Easy uptime history"/></figure><p>This basic dashboard shows the Success:Failure ratio as well as the median time to first byte. In this case it's monitoring this website, which was recently moved to Netlify. This already tells me two things:</p><ul><li>The US-East-1 region has occasional connectivity issues to Netlify's CDN</li><li>The Netlify CDN has much better routing for the US than London - 3x better Time to First Byte</li></ul><p>So I'm going to trial switching the probe to the US-East-2 region to see if it has better reliability, as well as investigate Netlify's CDN options to see if I can improve the UK based TTFB.</p><p>It'd be nice to have parameterised dashboards (so you don't have to save a dashboard per check), as you can with Grafana, but for the slight overhead and the fact I don't have to maintain my own Elasticsearch server, I can cope!</p><p><strong>Note:</strong> I've submitted my example checkless dashboard to Logz.io's contributions; I'll update this blog post with a link when it's live!</p><h2 id="alerting">Alerting</h2><p>Checkless's alerts to email/Slack work solely on individual events. If there's a failure - then alert. It's impossible to alert on 3 failures in a set period or similar without something storing state. Thankfully, with Logz.io we can alert on a set number of failures within a given period. </p><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-18-at-22.26.21.png" class="kg-image" alt="Checkless & Logz.io - Free & Easy uptime history"/></figure><p>There's still potential for a more configurable alert - number of failures in a row, number of probes in a failing state - but it's a definite advancement. Like the pure AWS implementation, the alerts can be sent via Email, slack, or anything that can be triggered by webhook.</p><p>In addition to alerting on failures, I've also configured a slack alert to warn me when there have been no check results for a set period (e.g. 10 minutes); this indicates to me when I've outright broken Checkless. When I first started Checkless, I enabled slack to output success as well as failure so that I could <em>see</em> it working, and so I could be sure that the sites were up, and that it wasn't that Checkless wasn't running. With this alert in place, I can be confident Checkless is working, and disable the rather verbose slack check success output. If I don't have any Checkless events in logz.io in 10 minutes, I know I'll get an alert.</p><h2 id="conclusion">Conclusion</h2><p><a href="https://www.logz.io/">Logz.io</a>'s community tier is great for the Open Source Community - it's not a product a lot of their competitors offer, and it's enabled me to gain some history and analytics for my site checks already, which immediately allowed me to spot some trends I hadn't seen when it was just a stream of slack messages. </p><p>I still plan on looking into DynamoDB as a storage backend in the future - I think that offering multiple options is a good thing for the project, and some other alerting/reporting use cases can be satisfied with a storage backend such as DynamoDB. But for now - combining Logz.io with Checkless has been really easy - and a massive benefit.</p>]]></content:encoded></item><item><title><![CDATA[Static Ghost - (with Gatsby & Netlify)]]></title><description><![CDATA[I've used ghost as my blogging platform for years now. It's simple, requires minimal setup, and it's dead easy to theme. It's never been the easiest to integrate with anything else, lacking even basic webhooks or API support, but recently this has begun to change. 
The new Admin & Content APIs allow easy access to all the Ghost Content, which opens options for external UI's based on React, static sites and other integrations. The possibility of converting my ghost blogs public hosting to a st]]></description><link>https://ghost.tegud.net/ghost-gatsby/</link><guid isPermaLink="false">Ghost__Post__5c7c1da97a4ee400019aa3a9</guid><category><![CDATA[ghost]]></category><category><![CDATA[gatsby]]></category><category><![CDATA[serverless]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Mon, 04 Mar 2019 22:45:15 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/static-ghost.JPG" medium="image"/><content:encoded><![CDATA[<img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/static-ghost.JPG" alt="Static Ghost - (with Gatsby & Netlify)"/><p>I've used ghost as my blogging platform for years now. It's simple, requires minimal setup, and it's dead easy to theme. It's never been the easiest to integrate with anything else, lacking even basic webhooks or API support, but recently this has begun to change. The new Admin & Content APIs allow easy access to all the Ghost Content, which opens options for external UIs based on React, static sites and other integrations.</p><p>The possibility of converting my ghost blog's public hosting to a static site really appealed to me, so that's where I started.</p><h2 id="but-why">But Why?</h2><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/ryan_reynolds_why.gif" class="kg-image" alt="Static Ghost - (with Gatsby & Netlify)"/></figure><p>So why is having a static site better than just running off of Ghost? Well it might not be for everyone, but I could immediately see several benefits for me:</p><ul><li><strong>Resilience</strong> - A standard ghost install requires its database and NodeJS app to run 100% of the time to serve content. Scaling requires both these elements to be scalable, or aggressive caching. With a static site, scalability is much, much easier. In fact a lot of platforms will take care of it for you, such as an S3 bucket or Netlify. </li><li><strong>Security</strong> - There's not much more secure than a static site. No NPMs to keep up to date, no NodeJS vulnerabilities. Your choice of deployment/host may include keeping a web server up to date, but there's much less to worry about than with an application and database. </li><li><strong>Performance</strong> - This will vary by your choice of static site builder, and implementation, but whatever your choice, there's no NodeJS overhead OR database query time. </li><li><strong>Portability</strong> - A static site can be hosted in a huge range of places - S3 buckets, Github pages, platforms like Netlify. You don't need to worry about hosting containers, or NodeJS, versions or anything like that; if it can host the static site, you're good to go!</li><li><strong>Affordability</strong> - Instead of needing ghost to run all the time, you can start it when you need to edit posts/site content. 
This means you'd only need the VM running ghost and its DB when it's needed, not 100% of the time.</li></ul><p>The general theme, as you can probably tell, is: <strong>There's much less to manage.</strong></p><h2 id="building-gatsby">Building: Gatsby</h2><p>Gatsby is the example provided by Ghost for usage of their Content API (<a href="https://docs.ghost.org/api/gatsby/">https://docs.ghost.org/api/gatsby/</a>), so I gave it a go first.</p><p>Gatsby is a tool for creating static sites using React. It has a powerful plugin architecture, which amongst other things allows you to plug in different data sources via GraphQL to generate the static site.</p><p>Ghost have provided one such plugin, and <em>just</em> about enough instructions to get up and running (to be fair to Ghost, they make it clear that this use case is very new, and better documentation is coming). </p><p>To get going execute: </p><p><code>gatsby new gatsby-starter-ghost https://github.com/TryGhost/gatsby-starter-ghost.git</code></p><p><code>yarn</code></p><p>This will give you a basic site with a default theme in react, which you can plug in to your ghost instance (either via environment variables or by <code>.ghost.json</code> file). To start developing just execute:</p><p><code>npm run dev</code> </p><p>And you'll get a local dev server (running on <code>localhost:8000</code>), allowing you to iterate on your design and build up your desired functionality. Additionally gatsby includes a graphql explorer (<code>http://localhost:8000/___graphql</code>) to help explore the dataset for your static site - which is a really nice feature.</p><p>I had some issues getting certain things to play nice together, even starting with an exact clone of the getting started git repo, but in general it was pretty painless. If you have even limited experience of React (like me), you'll likely be able to make your blog look the way you want it to (in my case I went for a like-for-like replacement of the existing theme).</p><p>Links:</p><ul><li><a href="https://github.com/tegud/tegud-static">My Static Site Repo</a></li><li><a href="https://github.com/TryGhost/gatsby-starter-ghost">Ghost Gatsby Starter</a></li></ul><h2 id="hosting-netlify">Hosting: Netlify</h2><p>Netlify is a platform for hosting static sites: you give it a github repo, and it sets up the deployments for you - deploying when the git repo is updated, setting up CDNs, HTTPS (via Let's Encrypt), everything. It makes it dead easy to get up and running really quickly, and for everything I needed, it was free!</p><p>Once through the process of signing up and adding my github repo, all I had to do was some minimal config to tell Netlify how my application was built (<code>gatsby build</code>) and we were good to go (ghost helpfully provided a starter project, with an included netlify.toml file). Getting up and running with a gatsby build that had worked locally, on one of netlify's default subdomains, took less than 10 minutes including signup!</p>
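<p>The config really is minimal - a netlify.toml for a Gatsby site needs little more than this (a minimal sketch; the starter project's file may include more):</p><pre><code># netlify.toml (minimal sketch)
[build]
  command = "gatsby build"
  publish = "public/"  # Gatsby writes the built static site to ./public</code></pre>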
<p>Getting domains transferred and HTTPS setup for those was a little trickier, because Netlify won't provision the HTTPS certificate until the DNS is fully propagated, which led to a rather upsetting period where the site was offline, stuck between DNS propagation and no HTTPS certificate on Netlify. But within 20 minutes I was fully back up and running. </p><p>The final thing to do was to configure deployments when ghost content was updated.</p><p><em>(A co-worker initially told me about Netlify and when I first looked at it all I saw was the documentation about branching and similar, which put me off, but it turns out it's very possible to do trunk based development and triggers based on webhooks, which is exactly what I wanted)</em></p><h2 id="deployments">Deployments</h2><p>As mentioned, Netlify is built to trigger deployments when the git repo is updated. This is great for when the theme is updated, or how the data is gathered/mapped to graphql, but not so much for when ghost content gets updated. Thankfully Netlify also supports deployments triggered from webhooks.</p><p>But we need to trigger deployments when Ghost content is changed. Traditionally with Ghost this was REALLY hard. I've seen people hijack slack webhooks and similar in the past to get a webhook when a post was published/updated/etc. Thankfully, with the Content & Admin APIs came proper webhook support as well. </p><p>First I configured Netlify to trigger deployments on a webhook; this provides you with a handy webhook url, which I then added to Ghost's integration section. To begin with I wasn't sure which hooks would fire when, and which ones I would need, but through usage I've determined the webhooks required are:</p><ul><li>Post unpublished</li><li>Published post updated</li><li>Post published</li></ul><p>This will cover any update to published content; anything else we don't care about, as it won't make it to the static site.</p><p>Finally I took advantage of Netlify's build/deployment notifications to signal in slack when deployments were happening:</p><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/netlify-notify.PNG" class="kg-image" alt="Static Ghost - (with Gatsby & Netlify)"/></figure><h2 id="did-it-work">Did it work?</h2><p>Yes! You're reading this on a static site, built on Gatsby, hosted on Netlify - a post that was written on my old ghost infrastructure. The deployments and webhooks were easy to set up, and I can make my blog look exactly how I want. Building on top of it, as it's a React app, should also be dead easy. Performance is improved (see below), and it should be super resilient! </p><p>What wasn't so good? Well, again, Ghost make it <em>really</em> clear that using the content API and Gatsby together is really new, and the documentation is light. I did suffer some frustrations where the getting started repo didn't work on a straight clone, with some issues around the siteMetadata graphql types, but I <s>hacked</s> remapped around that. As well, the gatsby helpers aren't well developed yet (the tag helper outputs React warnings as well...) - but I mention this more to reinforce that this is clearly new territory for Ghost, and not as a criticism. Just be aware that you're doing something very new, and these things will happen.</p><p>The Netlify experience was super smooth in general. As I say, I managed to go from signup to a working site on a netlify subdomain in under 10 minutes (once I'd got Gatsby working locally). 
Switching the DNS was a bit of a pain because of the time it took to be able to put the certificate in place. I was lazy and didn't lower my DNS's TTL before making the switch - it was 15 minutes when I changed - so I probably could have made my life easier there. Even so, there was at least <em>some</em> guaranteed downtime, which for a production website could be quite problematic.</p><p>Should you do it? Maybe. This blog has always been a bit of a trialling ground for me; if it was a production site then obviously you're going to want things to be more polished before you dive in, but for a production blog, the benefits I've listed above could potentially be <strong>more</strong> key than they are for me. That said, there were some limitations I identified that didn't affect me too much, but may cause you pain:</p><ul><li>Preview - with my configuration, only published posts go live; there are no preview links for the static site, and if you did get them up and running, it'd take a while to deploy and reflect the new preview anyway (see below). You can preview on your ghost "backend" but that assumes you keep your ghost theme and static site theme in sync, which would be a bit of an overhead</li><li>Content update delay - with standalone ghost, content updates happen immediately, but with the static site it'll take 2-3 mins for the content changes to be reflected live. This isn't a problem for me, not least given the potential benefits, but if you need instant updates to content, this maybe isn't for you.</li></ul><p>And performance improvements?</p><p><strong>Before:</strong><br>Not bad, but no doubt there's room for improvement.</br></p><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-03-at-18.26.44.png" class="kg-image" alt="Static Ghost - (with Gatsby & Netlify)"/></figure><p><strong>After:</strong><br>A massive improvement, as well as a slight boost in accessibility - all good news!</br></p><figure class="kg-card kg-image-card"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2019/03/Screen-Shot-2019-03-03-at-20.46.14.png" class="kg-image" alt="Static Ghost - (with Gatsby & Netlify)"/></figure><h2 id="to-pwa-or-not-to-pwa">To PWA or not to PWA</h2><p>By default the gatsby ghost starter includes the offline plugin. This utilises a service worker to allow your site to be available offline, <strong>but</strong> I noticed a couple of issues (one minor, one major):</p><ul><li>By default it will offer to install your blog to the user's homescreen on mobile. I assumed no one wants to install my blog to their home page. Thankfully you can turn this off in the gatsby-config.js: in the <code>gatsby-plugin-ghost-manifest</code> section, set <code>display</code> to <code>browser</code> (see the snippet below).</li><li>More concerning for a blog: the service-worker caching means that new posts and updates to posts seemed to appear indeterminately. For this reason I decided to disable the offline plugin for gatsby for now. I want to look into this more and determine what can be done to either control the caching, define the cache's ttl, or similar.</li></ul><p>I'd like to revisit the PWA implementation in the future, because it is very cool, including pre-fetching blog posts and more. But I'd prefer to have up to date content for now!</p>
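<p>As a sketch, the relevant part of gatsby-config.js looks something like this (options trimmed - check the starter project for the full set):</p><pre><code>// gatsby-config.js (sketch)
{
    resolve: `gatsby-plugin-ghost-manifest`,
    options: {
        // ...name, icons and other existing options...
        display: `browser`, // stops the add-to-homescreen install prompt
    },
},</code></pre>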
<h2 id="scheduled-posts">Scheduled Posts</h2><p><a href="https://twitter.com/i8ramin">i8ramin </a>asked on Twitter if the static site trigger would work with scheduled posts. The good news is that it does! But there is a gotcha I discovered: I hadn't updated the ghost instance to use the new ghost url (it has a different subdomain, as the static site now lives on www.tegud.net). Without doing that, ghost's mechanism for scheduling posts (it fires a request to itself) won't work. So make sure you update your ghost <code>url</code> configuration value to be the ghost instance's url, <strong>not</strong> the static site url.</p><h2 id="what-s-next">What's Next?</h2><p>Netlify helped me get up and running very quickly, but there are other options - S3 is a popular choice for hosting static sites, and building our own nginx container to serve the content would allow it to be hosted in a wider range of places; in the interests of resiliency we could prepare the container and deploy it on demand.</p><p>So Gatsby was a success, but there are other options for static site generation out there, so I'll likely have a look at those next. I have some other ghost blogs I'll likely migrate to this system, and then look at what I can do to reduce the costs of the ghost components. I'm going to see what I can do to extract some of the React Components I've used on my blog and make them available; hopefully they will be of use to others.</p><p>Finally, the netlify pipeline I implemented was simple, but having a pipeline opens up new possibilities - resizing & optimising of images, creating responsive images automatically, verifying content before publishing - far more than just <strong>Build</strong> - <strong>Deploy</strong>.</p><p>Overall I'm really excited about the direction that Ghost is going; treating APIs and webhooks as first class citizens opens up a huge number of possibilities for ghost as a CMS platform, not just a blog. If you give ghost as a static site a go with Gatsby, or another tool, give me a shout on <a href="https://twitter.com/tegud">twitter</a>, I'd love to hear how other people are doing it.</p><p><em>Header Image: <a href="https://commons.wikimedia.org/wiki/File:No_Signal_23.JPG">Wikipedia </a></em></p>]]></content:encoded></item><item><title><![CDATA[Create Free Site Checks with Checkless]]></title><description><![CDATA[Checkless is a tool for monitoring your websites via simple HTTP checks, but unlike paid for SaaS offerings, you can do it for free (probably). Checkless is built on top of serverless technologies like AWS Lambda, and an open source framework for managing code in environments such as Lambda called Serverless [https://serverless.com/]. The good thing about serverless platforms such as AWS Lambda is they often have a free tier. This allows you to run quite a lot of code, for no cost. Once you]]></description><link>https://ghost.tegud.net/create-free-site-checks-with-checkless/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa361</guid><category><![CDATA[checkless]]></category><category><![CDATA[serverless]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Thu, 26 Jul 2018 15:10:57 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/top-banner-2.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/top-banner-2.jpg" alt="Create Free Site Checks with Checkless"/><p>Checkless is a tool for monitoring your websites via simple HTTP checks, but unlike paid-for SaaS offerings, you can do it for free (probably). 
Checkless is built on top of serverless technologies like AWS Lambda, and <a href="https://serverless.com/">Serverless</a>, an open source framework for managing code in environments such as Lambda. </p><p>The good thing about serverless platforms such as AWS Lambda is they often have a free tier. This allows you to run quite a lot of code for no cost. Once you go over the limit, it's likely you're not going to pay much either. For example, running 3 checks in 3 regions every 5 minutes can be as cheap as:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/estimation-basic.PNG" class="kg-image" alt="Create Free Site Checks with Checkless"><figcaption>Cost of 3 checks in 3 regions, running every 5 minutes</figcaption></img></figure><p>And that's without the free tier included; with the free tier you'll pay nothing. I started Checkless to monitor my own websites (it was called Lambda Overwatch then), and received some interest in how people could reuse it. Checkless is the result of simplifying the configuration/deployment of that system, as well as some other improvements, such as Multi-Region support.</p><p>If you need journey tests or anything fancy like that, look at your Pingdoms, etc. But for a simple check of "Up or not?" it's by far the cheapest way I've found to do it.</p><h2 id="getting-started">Getting Started</h2><p><em><strong>Just show me an example: </strong>If you'd like to see a working example, including automation, <a href="https://github.com/tegud/checkless-tegud">have a look at my site's checkless config</a>.</em></p><h3 id="install">Install</h3><p>First thing we need to do is install the pre-requisites. You'll need:</p><ul><li>NodeJS, version 8.10 or above</li><li>The Serverless CLI tool, install it with npm via: <code>npm i -g serverless</code></li><li>An AWS account and service account, with the correct permissions and a key</li></ul><p><strong>Note: </strong>Currently deployments are not working correctly on windows environments; it is recommended to use the Windows Subsystem for Linux (WSL) Bash to execute deployments if required. Though I'd really recommend using a free SaaS CI such as <a href="https://circleci.com/">CircleCI</a> or <a href="https://travis-ci.org/">TravisCI</a>.</p><p>Next, in your command line enter: <code>npm i -g checkless-cli</code>. This will install the Checkless CLI. If you'd prefer to do it without the CLI, I'll be publishing a "Checkless from scratch" guide soon.</p><h3 id="setup">Setup</h3><p>Once installed, navigate to the directory where you will store your checkless configuration. I recommend keeping it in source control, such as GitHub, as it'll help you automate it and keep track of changes. </p><p>(If you already have a checkless.yml file, then you can skip this stage)</p><p>First enter:</p><p><code>checkless init</code></p><p>This will then take you through the stages of creating the initial configuration for your checks:</p><!--kg-card-begin: html--><script src="https://asciinema.org/a/9tkMWBeUT0qOjPAcJWskD2c0O.js" id="asciicast-9tkMWBeUT0qOjPAcJWskD2c0O" async=""/><!--kg-card-end: html--><p>Once complete you will have a checkless.yml file in your directory. 
If you open the file you should see contents similar to: </p><pre><code>region: eu-west-1

checks:
  tegud.net:
    url: https://www.tegud.net
    checkEvery: 5 minute
    regions:
      - eu-west-1
      - us-east-1

notifications:
  - slack:
      webhookUrl: "https://myslackwebhook.com/ffdsfsdfgfdgrewasd"</code></pre><p>You should update the webhookUrl to your own slack webhook, so that checkless can tell you whether the check was a success or not. In time more notifiers will exist, but slack is a good way of showing it's working.</p><p>Before moving on, make sure you execute</p><p><code>npm install</code></p><p>so that checkless itself is installed.</p><h3 id="estimate-the-cost">Estimate the Cost</h3><p>Once you've built your config you can check how much it will cost you. The estimate function of the CLI is very much that - an estimate. It's based on your checks being run at the rate specified for 30 days, so February will be cheaper, and months with 31 days will be more expensive.</p><p><code>checkless estimate</code></p><p>This will provide you with an estimate for your configuration. It's likely it'll be good news: it'll all fit in the Free Tier. If you've already used up your free tier then you can specify:</p><p><code>checkless estimate --ignore-free-tier</code></p><p>to ignore the free tier; if there is a cost to be paid it'll break it down by region/function/memory.</p><h3 id="generate-the-serverless-config">Generate the Serverless config</h3><p>Checkless leverages Serverless to handle deployment of the functions and other configuration. Serverless is a great way of managing your functions in AWS Lambda (and other serverless environments) on its own, but it is quite verbose; the Checkless CLI is essentially a way of generating the Serverless config. </p><p>Within the directory with the checkless.yml configuration, execute:</p><p><code>checkless generate</code></p><p>This will generate the Serverless configuration and Checkless application assets per region within a .checkless folder. If you're managing in source control, add the folder to your <code>.gitignore</code> file to prevent it from being checked in. Once the configuration is generated, you can deploy checkless.</p><h3 id="deploy-checkless">Deploy Checkless</h3><p>Within the same folder the .checkless folder was generated in by the previous step (likely the same place as your checkless.yml) execute:</p><p><code>checkless deploy</code></p><p>This will then execute a Serverless deploy in each region folder in sequence. This can take some time, and if any stage fails you may end up with some regions configured, but not all.</p><p>Once complete, checkless should be configured in your AWS account. If the webhook is working right you should start to see:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/multi-region-1.PNG" class="kg-image" alt="Create Free Site Checks with Checkless"><figcaption>Checkless sending updates to Slack</figcaption></img></figure><h3 id="updates-more">Updates & More</h3><p>To update your checks, simply update the config, and then execute:</p><p><code>checkless generate</code> and then <code>checkless deploy</code></p><p>and the checks will be updated. </p><p>To update or remove checkless itself or the CLI, check out the <a href="https://www.github.com/tegud/checkless-cli">readme</a>.</p>
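<p>If you'd rather not run the updates from your own machine, the same two commands drop straight into a CI job. A minimal CircleCI sketch - the image choice and dependency layout are assumptions to adapt to your own setup:</p><pre><code># .circleci/config.yml (illustrative sketch - assumes checkless-cli is a devDependency,
# and AWS credentials are set as project environment variables)
version: 2.1
jobs:
  deploy-checks:
    docker:
      - image: circleci/node:10
    steps:
      - checkout
      - run: npm install
      - run: npx checkless generate
      - run: npx checkless deploy
workflows:
  deploy:
    jobs:
      - deploy-checks</code></pre>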
<h2 id="more-options">More Options</h2><p>At this point we should have our checks running as per the configuration. But there's more that can be configured, including notification options, check frequency and check expectations. Explore the other Checkless blog posts, or the Github repositories, to find out more.</p><h3 id="checkless-blog-posts">Checkless Blog Posts</h3><p>But wait, there's more! There will soon be more Checkless blog posts so you can find out more, or set up your own free site check. Check back soon.</p><ul><li><a href="https://ghost.tegud.net/updating-the-free-serverless-site-uptime-checker-for-2018/"><strong>Checkless: Updating the Free Serverless Site Uptime Monitor</strong></a> - Brief history of Lambda Overwatch and what changed when it moved to Checkless</li><li><em>More as they come...</em></li></ul><h3 id="checkless-repositories">Checkless Repositories</h3><p>Have a look at the checkless and checkless-cli repositories below if you want to jump straight to the code or get started:</p><ul><li><strong><a href="https://github.com/tegud/checkless-cli">Checkless CLI</a></strong> - Generate, Estimate Cost and Deploy Checkless from the CLI (or automate it!)</li><li><strong><a href="https://github.com/tegud/checkless">Checkless</a></strong> - base library, roll your own or contribute to the core modules.</li></ul>]]></content:encoded></item><item><title><![CDATA[Checkless: Updating the Free Serverless Site Uptime Monitor]]></title><description><![CDATA[Two years ago I created a personal project called lambda-overwatch, I wanted to check my sites were up, and I didn't want to pay anyone any money to do it. There was some interest in how I did it, and how people could replicate it. The former was quite easy, but replicating what I had built was not easy, so coming back to the project nearly two years later I had some goals: * Change the name - preferably one that wouldn't get me sued [https://smcarthurlaw.com/blizzard-settles-overwatch-tra]]></description><link>https://ghost.tegud.net/updating-the-free-serverless-site-uptime-checker-for-2018/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa360</guid><category><![CDATA[checkless]]></category><category><![CDATA[monitoring]]></category><category><![CDATA[free]]></category><category><![CDATA[pingdom]]></category><category><![CDATA[serverless]]></category><category><![CDATA[devops]]></category><category><![CDATA[operability]]></category><category><![CDATA[aws]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Sat, 26 May 2018 09:43:17 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/top-banner-1.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/top-banner-1.jpg" alt="Checkless: Updating the Free Serverless Site Uptime Monitor"/><p>Two years ago I created a personal project called lambda-overwatch: I wanted to check my sites were up, and I didn't want to pay anyone any money to do it. 
There was some interest in how I did it, and how people could replicate it.</p> <p>The former was quite easy, but replicating what I had built was not, so coming back to the project nearly two years later I had some goals:</p> <ul> <li><strong>Change the name</strong> - preferably to one that wouldn't <a href="https://smcarthurlaw.com/blizzard-settles-overwatch-trademark-lawsuit/">get me sued</a>.</li> <li><strong>Enable multi-region checks</strong> - Serverless doesn't really support multiple regions, but the requirement is clear: being able to check websites from multiple locations is a big deal.</li> <li><strong>Make deploying your own checks simple</strong> - The configuration of lambda overwatch was <strong>not</strong> easy, at all; you basically had to take the building blocks I had assembled and rejig them how you wanted. The multi-region functionality also made simple installations even harder, so to really enable multi-region checks, configuration needed improving in general.</li> <li><strong>Ad hoc checks</strong> - "Alexa check tegud.net is up"...</li> <li><strong>Allow other notifiers</strong> - Don't just tell slack; send to an alerting platform, send to a report, send to a webhook...</li> </ul> <p>As ever this is a work in progress, but first the name then...</p> <h2 id="lambdaoverwatchisnowcheckless">Lambda Overwatch is now Checkless</h2> <p>I still quite like the name Overwatch, and I'm not 100% on checkless, but I didn't really want a cease and desist off Blizzard, and it was mildly confusing given the huge game sharing the name, so <strong>Checkless</strong> then.</p> <p>It may not stick forever, it may change again, <a href="https://martinfowler.com/bliki/TwoHardThings.html">because developers are bad at naming things</a>, but for now I believe it conveys that Checks are involved, and that it may be somehow related to serverless, which is the main thing for now.</p> <p>It also has the major advantage of not being a hugely popular name on npm, which given the other goals was important.</p> <p>Finally, Lambda is gone from the title; I'm looking at multi-region for now, but why not multi cloud? Enabling checks from even more regions (and potentially using multiple free cloud allocations)!</p> <!--kg-card-end: markdown--><!--kg-card-begin: markdown--><h2 id="whatsnext">What's next?</h2> <p>My goal for Checkless is for it to continually evolve; that's only possible if it's easier to deploy, more portable and less coupled to my own configuration (which the original lambda overwatch repository on Github very much was).</p> <p>Multi-region and configuration streamlining are mostly there, but the other goals are still to do; the list of posts below will be updated as more progress is made.</p> <h3 id="checklessblogposts">Checkless Blog Posts</h3> <p>But wait, there's more! I've only changed the name so far - there will soon be more Checkless blog posts so you can find out more, or set up your own free site check. 
Check back soon for more.</p> <ul> <li><strong>Create your own Free Site Check with Checkless</strong> - Getting started with Checkless</li> <li><em>More as they come...</em></li> </ul> <h3 id="checklessrepositories">Checkless Repositories</h3> <p>Have a look at the checkless and checkless-cli repositories below if you want to jump straight to the code or get started:</p> <ul> <li><strong><a href="https://github.com/tegud/checkless-cli">Checkless CLI</a></strong> - Generate, Estimate Cost and Deploy Checkless from the CLI (or automate it!)</li> <li><strong><a href="https://github.com/tegud/checkless">Checkless</a></strong> - base library, roll your own or contribute to the core modules.</li> </ul> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Dynamic Logging - Log Level Per Request]]></title><description><![CDATA[Whilst checking out the latest Thoughtworks Tech radar I noticed a technique that had been on my mind recently as well: Log Level per request. Logging is a vital tool for operations, without appropriate logging you're blind, but you always have to balance granularity and detail vs storage costs/overheads. The benefits of logging and why you should log more are best served for (several) other blog posts, for now I want to explore what varying Log Levels per Request can look like, and whether any]]></description><link>https://ghost.tegud.net/dynamic-logging-log-level-per-request/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa35f</guid><category><![CDATA[logs]]></category><category><![CDATA[operability]]></category><category><![CDATA[pickaroon]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Tue, 15 May 2018 19:48:48 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/pickaroons-1.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2018/05/pickaroons-1.jpg" alt="Dynamic Logging - Log Level Per Request"/><p>Whilst checking out the latest Thoughtworks Tech Radar I noticed a technique that had been on my mind recently as well: <strong>Log Level per request</strong>.</p> <p>Logging is a vital tool for operations; without appropriate logging you're blind. But you always have to balance granularity and detail vs storage costs/overheads. The benefits of logging and why you should log more are best saved for (several) other blog posts; for now I want to explore what varying Log Levels per Request can look like, and whether any log libraries actually support this technique.</p> <p>I'm <a href="#in-the-wild">compiling a list of libraries</a> that support Log Levels per Request and will update this post over time, so please <a href="http://www.twitter.com/tegud">tweet me</a> if you have any libraries you know of!</p> <h2 id="thegeneralprinciple">The General Principle</h2> <p>To balance storage constraints when logging we'll often restrict the amount we log, often by filtering based on log levels. 
These log levels will often look something like this:</p> <ol> <li><strong>DEBUG</strong> - Low level, extremely detailed log entries; you'd often have tens per request, such as starting operation x, completing operation x, calling Database Query y.</li> <li><strong>INFO</strong> - Does not indicate any level of fault within the system; can indicate standard operation feedback such as scheduled tasks being triggered or completing, requests being handled, startup/shutdown operations.</li> <li><strong>WARNING</strong> - This is something to be aware of, but you don't necessarily need to wake anyone up about it.</li> <li><strong>ERROR</strong> - Something's wrong and someone would need to be woken up to fix it.</li> </ol> <p>In production it's extremely rare the Debug log level would be enabled, as the number of logs per request would be extremely high. If you're lucky you can cope with the load from having INFO level enabled, but that may not be the case. Which often leaves a lot of systems I've seen only logging Warnings/Errors. This is generally fine for detecting something going wrong in a way you've predicted, but it's rubbish for diagnosing unknown error states. This aspect of a system - the ability to observe how it's operating, not how it's failing in known ways - has become a more popular area of late, termed a system's Observability. I've encountered systems where it's hard to determine this: something goes wrong and you end up having to add logging iteratively until you start to zero in on the problem. I imagine you may have similar experiences.</p> <p>Observability is good then, but balancing it within a high traffic system can be challenging; this is where being able to dynamically adjust the level you're logging becomes key. With HTTP services the perfect method is available in Request Headers, e.g.:</p> <ul> <li><strong>X-Log-Level</strong>: DEBUG</li> </ul> <p>The logger would need to check this header when logging an item; in languages where functions are first class citizens (e.g. JS, .NET, etc.) this can be quite simple:</p> <script src="https://gist.github.com/tegud/40b057fd7a68e576cd844599592aff6c.js"/> <p>But the problem is it needs to be built <strong>into</strong> the logging framework. And most log frameworks I've encountered treat log levels as static entities, or at best something that can be modified for the <strong>entire</strong> system dynamically, not for specific requests (helpful, but would still potentially result in A LOT of logging).</p> <h2 id="distributedsystemstracingandmore">Distributed Systems, Tracing and More</h2> <p>Logging and Observability become even more critical in distributed systems. Correlation or Request IDs are extremely helpful in linking requests between systems so that all logs can be seen as one - what error sent to the user was caused by a lower level microservice?</p> <p>For this purpose request IDs are passed between services; in this case I'd recommend passing the request log level header as well, so that the expanded logging for a request propagates to all dependent services, not just the first contacted endpoint.</p> <p>Of course the example of varying by explicit header is just one possible application: the log level could be adjusted in this manner via cookie (include all users in an A/B test), or via random number generation (high detail logging for 40% of users), or more.</p>
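<p>As a hand-rolled sketch of the idea (illustrative only - not any particular library's API, and not the contents of the gist above), a per-request level resolver covering those strategies might look like:</p><pre><code>// Illustrative only: pick a log level for this request, falling back to the system default
function resolveLogLevel(req, defaultLevel) {
    if (req.headers['x-log-level']) return req.headers['x-log-level']; // explicit header
    if ((req.headers.cookie || '').includes('log-level=DEBUG')) return 'DEBUG'; // A/B cohort via cookie
    if (Math.random() < 0.4) return 'DEBUG'; // sample 40% of requests at high detail
    return defaultLevel;
}</code></pre>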
<p>Of course the example of varying by explicit header is just one possible application; the log level could be adjusted in this manner via cookie (include all users in an A/B test), via random number generation (high detail logging for 40% of users), or more. The only use case that would be particularly difficult to solve would be a common APM implementation: log outliers, as this would require the logs to be buffered outside of the application.</p> <p>As mentioned in the ThoughtWorks Tech Radar, this whole area also links in with the <a href="http://opentracing.io/">Open Tracing</a> project - which is definitely worth checking out for more information.</p> <p><a name="in-the-wild"/></p> <h2 id="inthewild">In the wild</h2> <p>No libraries I've encountered in NodeJS or .NET support dynamic log levels; if you know of a library in any language that supports the technique please contact me or <a href="http://www.twitter.com/tegud">tweet me</a>.</p> <h3 id="nodejs">NodeJS</h3> <p>No popular libraries (winston/bunyan, etc.) support the technique, but:</p> <h4 id="pickaroonpickaroonexpress">Pickaroon/pickaroon-express</h4> <p><em>Disclaimer: I wrote this to experiment with dynamic log levels and other logging techniques.</em><br> <a href="https://github.com/tegud/pickaroon">https://github.com/tegud/pickaroon</a> / <a href="https://github.com/tegud/pickaroon-express">https://github.com/tegud/pickaroon-express</a><br> I started writing Pickaroon as a way of trying out some more advanced logging techniques around enforcing consistent log field names, request context and dynamic log levels; for this reason the base library can accept a function as a log level, and the pickaroon-express library sets the log level from the x-log-level header (the code above is from that library).</br></br></p> <h3 id="net">.NET</h3> <p>No libraries support Request Log Level, but Serilog can change the log level during runtime, and <a href="https://blog.matthewskelton.net/2012/12/05/tune-logging-levels-in-production-without-recompiling-code/">Matthew Skelton blogged</a> in 2012 about changing the log level of individual log lines in the codebase by configuration.</p> <h4 id="serilog">Serilog</h4> <p><a href="https://serilog.net/">https://serilog.net/</a><br> Serilog can't do request-based log levels, but it can <a href="https://nblumhardt.com/2014/10/dynamically-changing-the-serilog-level/">change the log level of the entire system dynamically</a>.</br></p> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack]]></title><description><![CDATA[Operability Kata? You may already be familiar with Katas [http://katas.softwarecraftsmanship.org/] , you take an arbitrary problem and practice your skills around it. You often repeat the same one to focus on how you approach the problem, not the actual problem itself. But why focus only on writing code? Skills such as analysing logs and finding insight from a large sea of data are vital to supporting services in live.
After all, aggregating the logs is only half the challenge, do you have the sk]]></description><link>https://ghost.tegud.net/uk-general-election-tweet-analysis-with-elastic-stack/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa35e</guid><category><![CDATA[operability]]></category><category><![CDATA[logs]]></category><category><![CDATA[elasticsearch]]></category><category><![CDATA[kibana]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Mon, 12 Jun 2017 19:29:00 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/buckethead-elmo-may-680x434-1-2.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><h2 id="operabilitykata">Operability Kata?</h2> <img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/buckethead-elmo-may-680x434-1-2.jpg" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/><p>You may already be familiar with <a href="http://katas.softwarecraftsmanship.org/">Katas</a>: you take an arbitrary problem and practice your skills around it. You often repeat the same one to focus on how you approach the problem, not the actual problem itself. But why focus only on writing code? Skills such as analysing logs and finding insight from a large sea of data are vital to supporting services in live. After all, aggregating the logs is only half the challenge - do you have the skills to get what you need out of them?</p> <p>As part of talks for several local user groups on Log Aggregation, I often collect arbitrary data sets (Walking Dead/Game of Thrones premieres, finales) and use these for live demos. But can it be valuable to take arbitrary data sets and practice log/data analysis skills on them, just as a learning experience?</p> <p><strong>Disclaimer:</strong> <em>I am not a political expert, this data is not representative, trying to read too much into its politics is ill advised</em></p> <h2 id="thedataset">The Data Set</h2> <p>Here in the UK, we just had a general election, because you can never have too much politics. But when politics looks like the header image it's hard to complain too much.</p> <p>At 9pm the day before the election, I started collecting tweets with the following keywords: corbyn, theresamay, #ge2017, #generalelection2017. The two leaders of the UK's main parties and what I could see at the time as the "official" hashtags of the general election (if such things can be official).</p> <h2 id="thegoal">The Goal</h2> <p>Take the General Election twitter data set, and attempt to find changes in behaviour, insight or interesting facts from the data.
Where possible, link changes in data with known events.</p> <h2 id="summaryofthehow">Summary of the How</h2> <p>Setting up an Elastic Stack and getting data in is several large blog posts in itself, so suffice it to say that I set up a single Elasticsearch node, one Logstash and a Kibana instance to handle the job at hand, all managed via Rancher/Docker for ease of setup.</p> <p>To get the data in I have an application I wrote years ago to connect to a Twitter Stream, run the sentiment analysis and send it to Logstash: <a href="https://github.com/tegud/TweetStreamToLogstash">TweetStreamToLogstash</a>.</p> <h2 id="sowhatdoesthedatatellus">So what does the data tell us?</h2> <h2 id="quantity">Quantity</h2> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/all.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>Overall tweets peaked at 10pm on the day of the election, around 164,000 tweets in a single hour - this is when the polls closed and the Exit Poll was released. Before this point it was widely accepted that it was going to be an easy win for the Conservatives, so the Exit Poll pointing to a hung parliament (no party with an overall majority) was quite a shock.</p> <h4 id="corbynvsmay">Corbyn vs. May</h4> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/by-leader-1.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>Corbyn commands a significant lead in terms of tweet quantity throughout the process, but more people are tweeting about the election in general than either leader.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/by-party.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>Tweets about Labour vs Conservatives are much more evenly split, with more people referencing Corbyn specifically than mentioning the Labour party itself. Obviously the party names weren't part of the keyword set, so we're only picking up people who mention one of the other keywords and the party name as well, but it should work the same for both parties.</p> <h4 id="anomalyno1">Anomaly No. 1</h4> <p>Looking at the last two graphs I noticed a difference in the "Other" data, i.e. not Labour/Conservatives/party leader. Friday morning people were talking about a party that isn't Labour or Conservative - who is it and what happened?</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/by-party-dup.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>At 9am on June 9th, a large number of people suddenly started talking about the DUP, a smallish party in Northern Ireland which, as you can see, prior to this point rarely got a look-in or mention. What happened? News broke that they were being positioned as kingmakers, propping up a Conservative minority government. Suddenly a lot of people were very interested in the DUP:</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/google-dup-interest.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"><em>(data from google trends)</em></img></p>
<h4 id="anomalyno2">Anomaly No. 2</h4> <p>The second anomaly I spotted was at 9pm on 9th June: the DUP was continuing its downward trend after their appearance from nowhere, but Conservative and Labour had a brief resurgence - can we work out what happened then?</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/resurgence.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>Zooming in on the graph doesn't help matters much; the spike in interest lasts a brief 30 minutes, so let's see what people were saying during that window.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/terms.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>A terms aggregation suggests the top tweets during the period of 9pm to 10.30pm were to do with the general election, teen choice and the NBA Finals - certainly an odd combination. Then some rather more logical tweets about politics. But these tweets had been happening for some time, so they don't seem unusual for <em>just</em> this time period; we don't necessarily want to know what was most popular overall, but what was disproportionately popular within this time period.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/significant-terms.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>A significant terms aggregation shows not what is most popular, but what is proportionally more popular within a filtered aggregation than in the wider population. So we query the data over a wider time period, apply a filter aggregation for the window we care about, and then run a significant terms aggregation within it. As you can see above, everything returned by the significant terms aggregation was related to Kensington - Labour taking it was another unexpected result and was announced at 9pm.</p>
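<p>To make that concrete, the query takes roughly this shape - an illustrative sketch using the official elasticsearch NodeJS client, not the exact query I ran; the index and field names here are assumptions:</p> <pre><code>// Sketch: find terms significant for a 30 minute window, against
// the background of the wider day. Index/field names illustrative.
var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({ host: 'localhost:9200' });

client.search({
    index: 'tweets',
    body: {
        size: 0,
        query: { // the wider background period
            range: { '@timestamp': { gte: '2017-06-09T00:00:00', lte: '2017-06-09T23:59:59' } }
        },
        aggs: {
            resurgence: {
                filter: { // the window we actually care about
                    range: { '@timestamp': { gte: '2017-06-09T21:00:00', lte: '2017-06-09T21:30:00' } }
                },
                aggs: {
                    significant_words: {
                        significant_terms: { field: 'text', size: 10 }
                    }
                }
            }
        }
    }
}).then(function(response) {
    console.log(response.aggregations.resurgence.significant_words.buckets);
});
</code></pre>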
<h2 id="sentiment">Sentiment</h2> <p>So we know when people were tweeting, and what caused that to change, but can any insight be gained from the sentiment of those tweets? Are Labour or Conservative tweets more positive? Was anyone happy about a hung parliament?</p> <p>Part of the ingestion applies a sentiment analysis on the tweet content - this splits the tweet into tokens, finds ones with positive or negative meaning, and adds them together to produce a figure for the overall sentiment. It's basic, but it gives a general feeling for the types of words being used.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/sentiment-by-leader.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>Let's start off with the sentiment of each leader. Corbyn's red line tracks closely with the overall sentiment of the general election, just about positive. Theresa May fares poorly, generally negative, and getting worse at midday on the 9th. What happened then?</p> <p>If we ignore retweets (NOT RT), then we don't see the same pronounced drop off in sentiment for Theresa May:</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/sentiment-by-leader-no-rts.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>So the negative tweets are an RT spreading quickly at that point; let's have a look at the significant terms tweets, focused on when the sentiment drops.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/may-sentiment-drop-tweets.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <p>Oh dear, it appears it took a few hours for people to get good and angry about the Conservatives working with the DUP - the tweets significant for the period of Theresa May's tweet sentiment dropping (12pm - 4pm) are all to do with the DUP and terrorism.</p> <p>Sadly, as I didn't enable fielddata on the tweet content, I couldn't analyse which words were being used for various tweets as I have previously - definitely something I need to remember when using Elasticsearch 5.x.</p> <h2 id="insummary">In Summary</h2> <h3 id="whatdidilearn">What did I learn?</h3> <p>So in the kata spirit of learning and iterating, what did I learn this time that I would do differently next?</p> <ul> <li><strong>Store more tweet data</strong> - I didn't have a chance to update TweetStreamToLogstash to record geo location where available; I think this would have been an interesting extra dimension to the data analysis. Ultimately I want to move the sentiment analysis functionality into a Logstash filter, but I have to build myself up to writing some Ruby first.</li> <li><strong>Poor little Elasticsearch Node</strong> - It turns out that was quite a lot of data for one $10 Vultr node; I pushed it over the edge several times in the course of the data analysis (a load average of 30 for a 1 core machine is <strong>not</strong> recommended)<br> <img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/poor-node.PNG" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></br></li> <li><strong>Elasticsearch 5 changes</strong> - This was an omission on my part: I'd forgotten to set the text fields (non-keyword analysis of string field mappings in Elasticsearch) to enable fielddata, meaning that I could not aggregate on the tokenised version of the tweet content. I can rectify this by re-indexing into another Elasticsearch index so that I can change the mappings of the particular type, but I'll leave that for another day and blog post - see the sketch after this list for what the mapping change looks like.</li> </ul>
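<p>As a rough sketch of that mapping change (index/type/field names are assumptions - and note that in 5.x the <code>fielddata</code> setting on an existing text field can generally be toggled via the put mapping API, though re-indexing works too):</p> <pre><code>// Sketch: enabling fielddata on a text field in Elasticsearch 5.x,
// so the tokenised tweet content can be aggregated on.
var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({ host: 'localhost:9200' });

client.indices.putMapping({
    index: 'tweets',
    type: 'tweet',
    body: {
        properties: {
            text: {
                type: 'text',
                fielddata: true // loads the tokenised terms into heap - watch that poor little node!
            }
        }
    }
}).then(function() {
    console.log('mapping updated');
});
</code></pre>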
<h3 id="operabilitykatadiditwork">Operability Kata - did it work?</h3> <p>Did I find the exercise useful? Exploring this data provided some unique challenges not usually present in web server/application log analysis. I got to use in anger some tools I hadn't used a great deal (Kibana/Elasticsearch 5.x), and I got to reuse some skills I'd used before in a more relaxed environment - usually if I'm using significant terms on log data, it's because something bad has happened and I've no idea what it is.</p> <p>Would it be worthwhile reusing the same data set? Possibly - especially once I've re-indexed it with fielddata enabled for the tweet content. Other data sets would also be good to look into. Comparing the skills and techniques I used to analyse the tweets to those I've used for proxy/application logs in the past, there were a lot of parallels: determining where the anomalies are and what triggered them, and removing noise, are all skills that are key when trying to determine the root cause of an issue, live or retrospectively. On the anomaly detection side of things, I'd also really like to explore bringing in the new Machine Learning or graphing capabilities of Elasticsearch in future exercises.</p> <p>But I'm also interested in the other directions an Operability Kata could take. Be it configuration management, secret management, provisioning servers - there are lots of techniques that I only try out when I need to build something new, or just haven't got round to. Practicing these things seems as key as practicing writing better code. I'm not certain what form these other katas could take; log data analysis is quite easy as the data is easily available, and exploring that data is, in itself, the challenge - but can we easily do the same for other areas?</p> <h3 id="backtopolitics">Back to politics?</h3> <p>As for political analysis? If I were to judge the winner of the night? I'd have to say Lord Buckethead, without hesitation:</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2017/06/lord_buckethead-1.jpg" alt="Operability Kata - UK General Election 2017 Tweet Analysis with Elastic Stack"/></p> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Building a (nearly) free site ping on AWS Lambda]]></title><description><![CDATA[Knowing if your website is up or down is important, having recently expanded to running a few websites, instead of just this blog, I wanted a way of having confidence everything was working! Nothing complicated - just an HTTP request and confirm that a 2xx response is received. Previously, my experience had been setting up monitoring on a larger scale - decent sized VMs, running Icinga2, checking tens to hundreds of sites/services. This time I was faced with the challenge of monitoring a small n]]></description><link>https://ghost.tegud.net/building-a-nearly-free-site-ping-on-aws-lambda/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa35b</guid><category><![CDATA[nodejs]]></category><category><![CDATA[lambda]]></category><category><![CDATA[monitoring]]></category><category><![CDATA[aws]]></category><category><![CDATA[cloud]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Thu, 08 Sep 2016 18:00:33 GMT</pubDate><media:content url="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/header-1473194310800.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/header-1473194310800.jpg" alt="Building a (nearly) free site ping on AWS Lambda"/><p>Knowing if your website is up or down is important; having recently expanded to running a few websites, instead of just this blog, I wanted a way of having confidence everything was working! Nothing complicated - just an HTTP request and confirming that a 2xx response is received.</p> <p>Previously, my experience had been setting up monitoring on a larger scale - decent sized VMs, running Icinga2, checking tens to hundreds of sites/services. This time I was faced with the challenge of monitoring a small number of websites, with the minimum amount of cost possible.</p> <p>I looked at the "Site ping" tooling market (is that a market?
I'm not sure, there's a few options...) and everything was a bit more expensive than I'd hoped; around £8 per month to check the sites seemed excessive - doing it myself seemed like it <strong>had</strong> to be cheaper!</p> <h2 id="balancingstabilityandcost">Balancing Stability and Cost</h2> <p>With only a few site checks to do, I wanted to have as little additional cost as possible, whilst providing peace of mind that my own and the other websites were working. This blog and my other sites are run on a simple Rancher cluster, so a monitoring container was an option, but I quickly identified some problems with this approach:</p> <ul> <li><strong>Sacrifice capacity for visibility</strong> - Setting up a container that was only needed once every <em>n</em> minutes to check the sites were up seemed a bit overkill - it'd be sat there, using resources (admittedly not a great deal, but still, it's a waste!)</li> <li><strong>Don't monitor what you're monitoring from the same box</strong> - It's generally not a good idea to monitor something from the same box it's on. Container death/code issues would have been detected, but if the Rancher cluster died, or the VM itself, there'd be no alert. Not so good.</li> </ul> <p>Based on this I wrote off a container in the same cluster as the sites themselves. Standing up a separate VM was an option, and cheaper than commercially available options, but it was still more than I wanted to pay.</p> <h2 id="100ofthecostfor1ofthetime">100% of the cost for 1% of the time</h2> <p>What got me thinking was the fact that for 4 minutes, 57 seconds in every 5 minutes, the server would be doing nothing. I don't want to have to pay for it doing nothing, but there's no point standing up/tearing down VMs every 5 minutes...</p> <h3 id="serverless">Serverless!</h3> <div style="float: right; padding: 0 0 10px 10px"><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/reinvent_launch_page_illustration_lambda-1473357469405.png" alt="Building a (nearly) free site ping on AWS Lambda"/></div> <p>So I had a look at AWS Lambda, part of what's termed "Serverless", or <a href="https://serverless.zone/serverless-is-just-a-name-we-could-have-called-it-jeff-1958dd4c63d7?gi=b80b440e06a5" target="_blank">"Jeff"</a> in the industry at large - I'm not going to get into why I think it's a bad name, plenty of people have covered the topic.</p> <p>In this case, Lambda seemed like a really good fit: I don't need the servers all the time, Lambda is available on AWS's free tier, and it supports NodeJS 4.3. It also gave me a location outside of Digital Ocean, and the potential to ping my sites from multiple locations (e.g. Ireland, Frankfurt, Singapore, etc). I also found <a href="http://engineering.curalate.com/2016/06/14/url-availability-with-lambda.html" target="_blank">this blog post</a> which confirmed I was on the right track, but lacked code examples.</p> <h2 id="gettingstarted">Getting Started</h2> <p>To begin with, as I was feeling my way round, I started off using the AWS console for everything.
I had an account already as I use Route53 to manage my Rancher DNS, but if you don't have one, you'll need a credit card to get set up.</p> <h4 id="beforewebegin">Before we begin</h4> <p>I'll focus on the Lambda setup below; if you want to set up lambda-overwatch, you'll need the following SNS topics and roles, so you may want to <a href="https://github.com/tegud/lambda-overwatch/blob/master/requirements.md" target="_blank">create these first</a> before delving into the code.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/lambda-1473108492732.png" alt="Building a (nearly) free site ping on AWS Lambda"/></p> <p>Once you have your account and the prerequisites, head straight over to the Lambda section in the Compute area. Select "Create function", and it'll take you to a wizard to create your first function.</p> <h3 id="functionno1makerequest">Function No.1: Make Request</h3> <p>The first function we're going to need will actually make our request. The Lambda Create Function wizard will first ask if we want to use a boilerplate function. In this case, just skip straight past this stage (bottom of the screen).</p> <h3 id="thetrigger">The Trigger</h3> <p>Next it will ask for our trigger; we want it to execute every <em>n</em> minutes, so we need a "Cloudwatch Events - Schedule" trigger. Once the trigger type is selected you can set the name/description and the all-important expression. The expression can either be a rate expression, or a CRON specification. In my case, a <code>rate(5 minutes)</code> was all I needed.</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/create_trigger-1473187305180.png" alt="Building a (nearly) free site ping on AWS Lambda"/></p> <h3 id="thecode">The Code</h3> <p>The final screen then requests the details of the lambda function itself. You can select to enter the code inline, upload a zip file, or retrieve the zip file from S3 storage. To start off with I just pasted the code into the editor window, but that gets tedious pretty quickly - to get up and running though, it's fine!</p> <p>The code I wrote is very simple: it creates an HTTP request to a URL specified in the incoming event. Once the response comes back, we parse the response and emit an event to an SNS topic. Additionally we handle timeouts, and again emit an event onto an SNS topic for other functions to process the response.</p> <p>This level of composability with Lambda is actually really nice, and I could quickly see different ways in which small dedicated functions could be arranged, with SNS/SQS gluing them together.</p> <p>To use my code, you'll need to replace <code>%RESULT_SNS_TOPIC_ARN%</code> with your SNS ARN - you can find it <a href="https://github.com/tegud/lambda-overwatch/blob/master/make-request/index.js">hosted in github</a>.</p>
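<p>The full version is in the repository linked above; as a rough sketch of its shape - the event format matches the "Setting the URL" section below, the placeholder ARN is substituted before deploy as just described, and the rest of the details are illustrative rather than the exact code:</p> <pre><code>// Sketch of make-request: fetch the URL from the incoming event,
// then publish the outcome to an SNS topic for other functions.
var http = require('http');
var AWS = require('aws-sdk');

var sns = new AWS.SNS();
var resultTopicArn = '%RESULT_SNS_TOPIC_ARN%'; // replaced before deploy

exports.handler = function(event, context, callback) {
    var started = Date.now();

    http.get(event.url, function(response) {
        response.on('data', function() {}); // consume the body so the response completes
        response.on('end', function() {
            sns.publish({
                TopicArn: resultTopicArn,
                Message: JSON.stringify({
                    url: event.url,
                    statusCode: response.statusCode,
                    timeTaken: Date.now() - started
                })
            }, callback);
        });
    }).on('error', function(err) {
        callback(err);
    });
};
</code></pre>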
<h3 id="settingtheurl">Setting the URL</h3> <p>I mentioned above that the incoming event will include the URL, but by default the Cloudwatch Scheduled Event trigger just includes a bunch of metadata and some things to do with scheduling - unsurprisingly, nothing about my URLs. Fortunately we can override the event a trigger supplies to Lambda with some arbitrary JSON, e.g.: <code>{ "url": "http://www.tegud.net" }</code>.</p> <p>All you need to do is go to the "Triggers" tab, select the trigger we created earlier and in the top right click "Actions -> Edit".</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/create_trigger_options-1473191581183.png" alt="Building a (nearly) free site ping on AWS Lambda"/></p> <p>On the right you should see a Configure input section as above; select "Constant (JSON text)" and you can override the entire event JSON as required. In theory we could add additional properties, allowing for dynamic timeouts and the like (be careful of the default max execution time on a lambda function though).</p> <p>Once the target is configured correctly, you should be able to set it going, but we won't hear anything to begin with...</p> <h2 id="simplenotification">Simple Notification</h2> <p>We check an HTTP endpoint every 5 minutes, but we don't do anything with the result! We could stick ever more logic into make-request, so that if it fails it publishes directly to the failure SNS topic, but I preferred to separate the two.</p> <p>Lambda has a maximum execution time of 5 minutes, and a default cut-off on functions of 3 seconds. These fairly aggressive timeouts put you against the clock - but I actually found it made me think about keeping each function doing as little as possible.</p> <p>Should this make a request AND decide what to do? Better to get the execution of the request done, publish it somewhere safe, like an SNS topic, and then let another dedicated function pick it up.</p> <h3 id="functionno2processresult">Function No.2 Process Result</h3> <p>We create our function the same way as the last one, but this time the trigger is going to be the SNS topic we published to last time. Fortunately this is just a matter of selecting SNS Topic from the trigger list, and entering its ARN (if using the "Before you begin" naming - resultComplete).</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/create_trigger_sns-1473192365888.png" alt="Building a (nearly) free site ping on AWS Lambda"/></p> <p>Once that's done, we paste in our <a href="https://github.com/tegud/lambda-overwatch/blob/master/handle-result/index.js">process result function code</a>. This time we're making decisions based on the status and publishing to ANOTHER SNS topic on complete/failure. Why separate ones? Your intended behaviour may be different, but I want to get an email when a site is down, and I'd like to see in Slack whenever a check has been carried out.</p> <p>Once we have that code in place, we can subscribe to our SNS via email and get a lovely formatted email when a site check fails:</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/site_fail_email-1473194678325.jpg" alt="Building a (nearly) free site ping on AWS Lambda"/></p> <p>Clearly not going to win any design awards, but it does the job! If a prettier email is required, then we could always hand it off to another function that will format it and use SES to send the email.</p>
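<p>Again, the real code is linked above; the rough shape of the decision-making (topic ARN placeholders follow the same substitution approach as before, and the exact routing in the repository may differ) is something like:</p> <pre><code>// Sketch of handle-result: triggered by the result SNS topic, decide
// whether the check passed and publish to the appropriate topic(s).
var AWS = require('aws-sdk');
var sns = new AWS.SNS();

var completedCheckArn = '%COMPLETED_CHECK_SNS_TOPIC_ARN%'; // replaced before deploy
var failedCheckArn = '%FAILED_CHECK_SNS_TOPIC_ARN%';       // replaced before deploy

function publish(topicArn, message) {
    return new Promise(function(resolve, reject) {
        sns.publish({ TopicArn: topicArn, Message: JSON.stringify(message) },
            function(err) { return err ? reject(err) : resolve(); });
    });
}

exports.handler = function(event, context, callback) {
    // SNS-triggered lambdas receive the published message in Records
    var result = JSON.parse(event.Records[0].Sns.Message);
    var passed = result.statusCode >= 200 && result.statusCode < 300;

    // Every check goes to completedCheck (for Slack); failures also
    // go to the failure topic (for email)
    var publishes = [publish(completedCheckArn, result)];
    if (!passed) {
        publishes.push(publish(failedCheckArn, result));
    }

    Promise.all(publishes).then(function() { callback(); }, callback);
};
</code></pre>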
<h2 id="fancyslacknotification">Fancy Slack Notification</h2> <p>An email when something goes wrong is great, but it's nice to have the reassurance that everything is working OK (especially when I was starting off and there were many more issues with the lambdas - did the site fail, or did the lambda function exit with an error?).</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/slack_alert-1473193334683.png" alt="Building a (nearly) free site ping on AWS Lambda"/></p> <h3 id="functionno3formatforslackwebhooks">Function No.3 Format for Slack Webhooks</h3> <p>Very similar to the previous one, except this time we're connecting to the "completedCheck" SNS topic. <a href="https://github.com/tegud/lambda-overwatch/blob/master/send-to-slack/index.js">The code</a> this time creates an HTTP request to a Slack Webhook endpoint. You can configure this through your team's apps/integrations; just take the URL it provides and replace <code>%SLACK_WEBHOOK_PATH%</code> with it.</p> <p>Once sorted you should see your check results appear in the Slack channel you selected on creating the integration.</p> <h2 id="gettingcomplicated">Getting Complicated</h2> <p>After I built the prototype, I realised there were so many places I could take this.</p> <ul> <li>Automated Build/Test/Deploy of the functions</li> <li>Save reports to S3 buckets (including responses and headers)</li> <li>Build a front end to expose the last results as a status page</li> <li>Utilise a NodeJS selenium driver to attempt to run user journeys in Lambda</li> <li>Other site based checks such as checking for asset sizes, and the like.</li> <li>Complete AWS Configuration - take out the manual steps above, possibly using the serverless framework.</li> </ul> <p>I'm halfway through the Automated Build/Test/Deploy stage at the moment, so I'll probably finish that first and follow up with a blog post on how I built the workflow! This makes managing the functions <em>much</em> easier, especially handling the replacement of SNS ARNs and the Slack webhook address.</p> <h2 id="isitcheap">Is it cheap?</h2> <p>One of the key points of using Lambda was the potentially cheap cost - instead of starting up a VM in EC2 and leaving it there all the time, was it actually cheaper?</p> <p>There's a <a href="http://blog.matthewdfuller.com/p/aws-lambda-pricing-calculator.html" target="_blank">dead handy online calculator</a> I used to estimate the cost, and it came out as a whole $0.68. Quite a bit cheaper than £8 then. As it stands Lambda is also in AWS's free tier, so no charge for a year either.
Pretty cheap for some peace of mind!</p> <p><img src="https://s3-eu-west-1.amazonaws.com/tegud-assets/2016/09/cost-1473195006933.jpg" alt="Building a (nearly) free site ping on AWS Lambda"><br> (Assuming 3x checks every 5 minutes for 31 days a month, with an execution time of 12s per check, which is <em>extremely</em> pessimistic - the actual cost should be a great deal cheaper, even ignoring the free tier)</br></img></p> <h2 id="githublink">Github Link</h2> <p>You can find all my source code for the project so far on Github:</p> <p><strong><a href="https://github.com/tegud/lambda-overwatch" target="_blank">https://github.com/tegud/lambda-overwatch</a></strong></p> <p>Feel free to fork it for your own purposes; it doesn't include the AWS setup for now though.</p> <p>[Image credit wikipedia, creative commons: <a href="https://en.wikipedia.org/wiki/Sonar">https://en.wikipedia.org/wiki/Sonar</a>]</p> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Introducing grunt-juve]]></title><description><![CDATA[Recently I've been in need of a method to easily performance test pages and pass/fail CI builds based on the results. A lot of our previous attempts have utilised selenium based tests, but these were heavyweight and unwieldy. At Velocity 2014 phantomas [https://github.com/macbre/phantomas] caught my eye as an alternative. I found grunt-phantomas [https://github.com/stefanjudis/grunt-phantomas] but it's quite tied to its HTML output and I wanted more flexible options. I also found Juve [https://gi]]></description><link>https://ghost.tegud.net/introducing-grunt-juve/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa356</guid><category><![CDATA[grunt]]></category><category><![CDATA[nodejs]]></category><category><![CDATA[performance]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Tue, 18 Feb 2014 22:45:00 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Recently I've been in need of a method to easily performance test pages and pass/fail CI builds based on the results. A lot of our previous attempts have utilised selenium based tests, but these were heavyweight and unwieldy. At Velocity 2014 <a href="https://github.com/macbre/phantomas" target="_blank">phantomas</a> caught my eye as an alternative. I found <a href="https://github.com/stefanjudis/grunt-phantomas" target="_blank">grunt-phantomas</a> but it's quite tied to its HTML output and I wanted more flexible options. I also found <a href="https://github.com/jared-stilwell/juve" target="_blank">Juve</a>, a test runner and assertion wrapper for phantomas; only problem - no grunt integration.</p> <h3 id="buildsomethingthen">Build something then?</h3> <p>So I built something, and enter: <a href="https://github.com/tegud/grunt-juve" target="_blank">grunt-juve</a>.
A grunt wrapper for Juve - you basically define an array of URLs, and associated assertions, and it will pass/fail based on these.</p> <h4 id="basicusage">Basic usage</h4> <p>As with all grunt modules, start by installing the npm package.</p> <pre><code>npm install grunt-juve --save-dev </code></pre> <p>Next tell grunt to load it:</p> <pre><code>grunt.loadNpmTasks('grunt-juve'); </code></pre> <p>Then you just need to add the grunt-juve section into your config object:</p> <pre><code>grunt.initConfig({ juve: { my_site: { options: { tests: [{ url: 'http://www.tegud.net', assertions: { htmlSize: 10 } }] } }, }, }); </code></pre> <p>This expects the size of the HTML to be 10 bytes or lower; this obviously won't pass, so we get the following output from the default reporter:</p> <pre><code>Executing Juve for 1 url... >> http://www.tegud.net failed, 1 of 1 assertion failed. >> Assertion htmlSize failed, expected: 10, was: 9017 Warning: Performance tests failed. Use --force to continue. </code></pre> <p>And the important part? Grunt failed. So I can very easily integrate this with a CI pipeline and get it to pass/fail based on the URLs & assertions defined.</p> <h4 id="configfiles">Config Files</h4> <p>As well as putting all your config into your grunt file, you can alternatively reference external files. This is so that for different environments you can reference different files, instead of overloading the gruntFile with excessive amounts of configuration for multiple environments. You specify the file such as:</p> <pre><code>juve: { 'tegud': { options: { file: 'ci.json' } } } </code></pre> <p>In this case the ci.json file just contains a JSON configuration, e.g.:</p> <pre><code>{ "tests": [{ "url": "http://www.gruntjs.com", "assertions": { "htmlSize": 10 } }] } </code></pre> <p>But the tests run in exactly the same way.</p> <h3 id="nextsteps">Next Steps</h3> <p>The basic grunt reporter in v0.0.2 simply outputs the information you need to know what failed. A more feature-complete reporter is planned, with support for verbose mode as well as for user-specified reporters.</p> <ul> <li><strong>0.0.3:</strong> Grunt reporter, fully featured with configurable log level, and support for verbose mode</li> <li><strong>0.0.4:</strong> LogStash reporter, a UDP broadcaster that transmits a Logstash-compatible message format to feed into our metrics collection system to easily graph the metrics over time</li> <li><strong>1.0.0:</strong> Bug Fixes and 1.0 release</li> </ul> <h3 id="projectlinks">Project Links:</h3> <ul> <li><a href="https://github.com/tegud/grunt-juve" target="_blank">GitHub</a></li> <li><a href="https://www.npmjs.org/package/grunt-juve" target="_blank">NPM</a></li> </ul> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Looking at Gulp, an alternative to Grunt]]></title><description><![CDATA[I’ve been using Grunt for a while now, it’s a great way of managing your JavaScript workflow, for both client side and server side applications. We introduced it into our workflow for managing the minifying of JavaScript and running of tests/linting code, and we’re looking to expand its use further. Grunt has great community support and a huge range of plugins. So it was with some scepticism I looked at Gulp, a new JavaScript task runner.
My initial question of course being, “Why not just use G]]></description><link>https://ghost.tegud.net/looking-at-gulp-an-alternative-to-grunt/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa357</guid><category><![CDATA[nodejs]]></category><category><![CDATA[gulp]]></category><category><![CDATA[buildtools]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Wed, 15 Jan 2014 20:12:00 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>I’ve been using Grunt for a while now, it’s a great way of managing your JavaScript workflow, for both client side and server side applications. We introduced it into our workflow for managing the minifying of JavaScript and running of tests/linting code, and we’re looking to expand its use further. Grunt has great community support and a huge range of plugins.</p> <p>So it was with some scepticism I looked at Gulp, a new JavaScript task runner. My initial question of course being, “Why not just use Grunt?”. Having spent some time with Gulp though, I unexpectedly found myself enjoying using it.</p> <h3 id="imperativevsdeclarative">Imperative vs Declarative</h3> <p>In Grunt you configure what you want to happen; this generally takes the form of a JavaScript object with properties defining the configuration of the tasks within your workflow. This works well once you know what you’re doing, but I found there was a learning curve before you really know what you’re doing with your GruntFiles. Large and complex configurations can also be time-consuming to read - it’s not particularly self-documenting.</p> <p>Gulp is different: you tell it what to do, and it feels much more like programming than Grunt does. The main advantage I found was when you come to Gulp, you already pretty much know how to use it, because you can program JavaScript. The example GulpFile I built below for my node app (hence no concat or minification) hopefully shows how simple it can be.</p> <pre><code>var gulp = require('gulp'); var mocha = require('gulp-mocha'); var gutil = require('gulp-util'); var jshint = require('gulp-jshint'); var jscs = require('gulp-jscs'); var watch = require('gulp-watch'); var src = './src/**/*.js'; var tests = './test/tests/**/*.js'; gulp.task('test', function () { gulp.src(tests, { read: false }) .pipe(mocha({ reporter: 'spec' })) .on('error', gutil.log); }); gulp.task('lint', function() { gulp.src([src, tests]) .pipe(jshint()) .pipe(jshint.reporter('default')) .pipe(jscs()) .on('error', gutil.log); }); gulp.task('default', function(){ gulp.run('test'); gulp.run('lint'); }); gulp.task('watch', function() { gulp.src([src, tests], { read: false }) .pipe(watch(function(events, cb) { gulp.run('default', cb); })); }); </code></pre> <p>Gulp takes files as streams and pipes them through plugins. If you understand how streams work, then it’s pretty easy to slot your tasks together, but if you’re not familiar with node streams then it might not make sense immediately, especially when you’re trying to set up gulp-watch.</p> <h3 id="butgruntworkswhybother">But Grunt works, why bother?</h3> <p>A lot of the discussion has been around whether Gulp needs to exist when we have Grunt. Thankfully there’s also been plenty of sensible people pointing out that though they’re both task runners, there are situations where each excels.<br> Gulp, by placing streams at its core, needs something to stream, so if you’re wanting to do tasks without files, I’m not sure how well it’d work.
Grunt has a huge plugin base already, whereas Gulp is newer so has fewer (though I didn’t find something I wanted to use that wasn’t supported; even the very new JavaScript Coding Standards npm was supported). I think Gulp is ideally suited to NodeJS applications, as developers will likely already have a good understanding of streams.</br></p> <p>Should you switch projects to Gulp instead of Grunt? No, there’s really no point. But if you’re starting a new project and you’re familiar with streams, I’d definitely recommend giving it a go, especially for Node applications.</p> <p>All in all I found Gulp was a very pleasant development experience, and I will definitely be keeping it in mind when selecting a task runner for new projects, especially NodeJS.</p> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[NodeJS Http Requests stuck at five]]></title><description><![CDATA[Whilst working on our monitoring system we encountered some rather odd behaviour sending data into Elasticsearch over HTTP. We would get 5 responses, and then it would abruptly stop. We were calling request.end(), so as far as we knew we were ending the connection, after some googling and a rather helpful stackoverflow post we discovered the issue was we were not consuming the data. Once we added in: response.on('data', function() {}); All the data started flowing in. The NodeJS API guide fo]]></description><link>https://ghost.tegud.net/nodejs-http-requests-stuck-at-five/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa359</guid><category><![CDATA[nodejs]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Tue, 10 Sep 2013 19:25:00 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Whilst working on our monitoring system we encountered some rather odd behaviour sending data into Elasticsearch over HTTP. We would get 5 responses, and then it would abruptly stop.<br> We were calling request.end(), so as far as we knew we were ending the connection; after some googling and a rather helpful stackoverflow post we discovered the issue was we were not consuming the data. Once we added in:</br></p> <pre><code>response.on('data', function() {}); </code></pre> <p>All the data started flowing in. The NodeJS API guide for the Http Request class seems to be notably missing this rather core piece of information, so if you hit upon a limit of five requests (five being the default maxSockets of the HTTP agent), or massive memory leaks, then make sure you’re calling .on('data', ...) even if you don’t want to do anything with it!</p> <p>If it helps, our code to call Elasticsearch over HTTP looked like:</p> <pre><code>var params = { host: this.host, port: this.port, method: 'POST', path: elasticSearchPath }; var json = JSON.stringify(data); var req = http.request(params, function(res) { if (res.statusCode != 201) { this.error_buffer.emit('error', new Error('Wrong ElasticSearch return code ' + res.statusCode + ', should be 201. Data: ' + json)); } else { this.error_buffer.emit('ok'); } res.on('data', function() {}); }.bind(this)); req.on('error', function(err) { this.error_buffer.emit('error', err); }.bind(this)); req.end(json); </code></pre> <h4 id="update">Update:</h4> <p>After searching through the node documentation, it does actually reference the need to read the data, it’s just extremely easy to miss. “Also, until the data is read it will consume memory that can eventually lead to a 'process out of memory' error.”</p> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Finite State Machines in JavaScript]]></title><description><![CDATA[Unfortunately I didn’t have a chance to go to this year’s jQuery UK conference, but a friend was tweeting as it happened and mentioned Doug Neiner’s talk on Machina.js (http://events.jquery.org/2013/uk/schedule.html#doug). This got me looking at the Finite State Machine library and I started to see immediate applications to what I was working on. Machina does lots, but if you just want a simple state machine, there’s an awful lot of code. This led me down the path of building my own smaller Finit]]></description><link>https://ghost.tegud.net/finite-state-machines-in-javascript/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa35a</guid><category><![CDATA[javascript]]></category><category><![CDATA[finite-state-machines]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Wed, 05 Jun 2013 13:33:00 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Unfortunately I didn’t have a chance to go to this year’s jQuery UK conference, but a friend was tweeting as it happened and mentioned Doug Neiner’s talk on Machina.js (<a href="http://events.jquery.org/2013/uk/schedule.html#doug" target="_blank">http://events.jquery.org/2013/uk/schedule.html#doug</a>). This got me looking at the Finite State Machine library and I started to see immediate applications to what I was working on. Machina does lots, but if you just want a simple state machine, there’s an awful lot of code. This led me down the path of building my own smaller Finite State Machine.</p> <h4 id="notnotbuiltherehonest">Not “Not built here”, honest</h4> <p>Not just a matter of “not built here”, my decision to build my own was mainly down to Machina.js being 5KB compressed. The fact it included lots of functionality is great for when you want to extend and integrate with a message bus, but in this case I just wanted to have a really simple implementation. What’s more I found it an ideal way to increase my understanding of Finite State Machines.</p> <h4 id="sowhydoievenwantone">So why do I even want one?</h4> <p>If you’re familiar with Finite State Machines, you can probably skip this bit. If you’ve not encountered Finite State Machines, you’ve almost certainly created state within your code.</p> <p>We often represent state within code without explicitly meaning to. This often takes the form of status booleans, e.g. isClosed or hasLoaded.
Keeping track of state this way and managing the values of these flags can become a nightmare to maintain, and makes it much harder for new programmers to understand your code.</p> <p>An example of managing state using flags is this expanding block:</p> <pre><code>var ExpandingBlock; (function() { ExpandingBlock = function(element) { var isOpen = true; function animationInProgress() { return element.is(':animated'); } function open() { if(!animationInProgress()) { element.slideDown({ complete: function() { isOpen = true; } }); } } function close() { if(!animationInProgress()) { element.slideUp({ complete: function() { isOpen = false; } }); } } return { open: open, close: close, toggle: function() { if(isOpen) { close(); } else{ open(); } } }; }; })(); $(function() { var block = new ExpandingBlock($('#block')); $('#button-set') .on('click', '#open', function() { block.open(); }) .on('click', '#close', function() { block.close(); }) .on('click', '#toggle', function() { block.toggle(); }); }); </code></pre> <p>The most obvious state flag in the code is:</p> <pre><code>if(isOpen) { close(); } else{ open(); } </code></pre> <p>This indicates at least two states existing within the code - open and closed. However, another if statement indicates yet another two states:</p> <pre><code>if(!animationInProgress()) { … } </code></pre> <p>So if we’re currently animating, don’t do anything, which is the case for both opening and closing.</p> <p><img src="https://tegud-assets.s3.amazonaws.com/2016/06/StateDiagram-1465712665597.png" alt="Expanding box state diagram"/></p> <p>Four states then: Open, Closing, Closed and Opening. And we can transition from open to closing, closing to closed, closed to opening and opening to open. With those identified we can draw our state diagram, as shown above.</p> <p>Same functionality, implemented using a Finite State Machine:</p> <pre><code>var ExpandingBlock; (function() { ExpandingBlock = function(element) { var finiteStateMachine = new nano.Machine({ states: { open: { close: function() { this.transitionToState('closing'); }, toggle: function() { this.transitionToState('closing'); } }, closing: { _onEnter: function() { var machine = this; element.slideUp({ complete: function() { machine.transitionToState('closed'); } }); } }, closed: { open: function() { this.transitionToState('opening'); }, toggle: function() { this.transitionToState('opening'); } }, opening: { _onEnter: function() { var machine = this; element.slideDown({ complete: function() { machine.transitionToState('open'); } }); } } }, initialState: 'open' }); return { open: function() { finiteStateMachine.handle('open'); }, close: function() { finiteStateMachine.handle('close'); }, toggle: function() { finiteStateMachine.handle('toggle'); } }; }; })(); $(function() { var block = new ExpandingBlock($('#block')); $('#button-set') .on('click', '#open', function() { block.open(); }) .on('click', '#close', function() { block.close(); }) .on('click', '#toggle', function() { block.toggle(); }); }); </code></pre> <p>Instead of boolean flags and if statements, we define the possible states in an object, as well as the initial state (in this case open). Event handlers defined on each state define what events that state will accept. Defining these for each state means we don’t have to check what state we’re in, and it controls the transitions between states - i.e. you can’t go from open to opening. This implementation of Finite State Machines (same as machina.js) supports another event handler - _onEnter.
This is automatically called when the state machine enters that state, allowing you to define behaviour that will always occur on entry - e.g. opening will always cause the block to start the expanding animation.</p> <p>Using states and event handlers to manage the block’s expanding and collapsing means we don’t need to check if the block is already open or not - it’s either in an open or closed state. Neither do we need to check if an animation is in progress (and exit immediately if it is); in an opening or closing state we don’t listen for those events at all. This removes lots of nasty if statements and branching code!</p> <h4 id="buildingnanomachinejs">Building nanoMachine.js</h4> <p>Building a Finite State Machine implementation really helped me understand how they can be used, so whether it ends up being of use to anyone or not, it was a valuable exercise. The final implementation of nanoMachine.js comes in at 310 bytes compressed, so if you don’t need all the functionality of machina.js, consider using nanoMachine.js.</p> <h4 id="tldr">TL;DR</h4> <p>You can see the source of nanoMachine.js and download the minified version on GitHub:</p> <p><a href="https://github.com/tegud/nanoMachine.js">nanoMachine.js GitHub repository</a></p> <!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[XSD Element content and attributes]]></title><description><![CDATA[This is a problem I encountered whilst working on a configuration XSD at work: I wanted to have a node that had content with an enforced type, along with attributes. An example of the XML I wanted to allow is below. <Quotes> <Quote author="George W. Bush">Rarely is the question asked: Is our children learning?</Quote> </Quotes> To specify a type for the node contents as well as some attributes, we need to use a type extension, specifying the type we want for the content as the base type. ]]></description><link>https://ghost.tegud.net/xsd-element-content-and-attributes/</link><guid isPermaLink="false">Ghost__Post__5c7ae46c7a4ee400019aa358</guid><category><![CDATA[xml]]></category><category><![CDATA[xsd]]></category><dc:creator><![CDATA[Steve Elliott]]></dc:creator><pubDate>Fri, 03 Jul 2009 12:39:00 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>This is a problem I encountered whilst working on a configuration XSD at work: I wanted to have a node that had content with an enforced type, along with attributes. An example of the XML I wanted to allow is below.</p> <pre><code><Quotes> <Quote author="George W. Bush">Rarely is the question asked: Is our children learning?</Quote> </Quotes> </code></pre> <p>To specify a type for the node contents as well as some attributes, we need to use a type extension, specifying the type we want for the content as the base type. This is shown below.</p> <pre><code><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="Quotes"> <xs:complexType> <xs:sequence> <xs:element name="Quote"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="author" type="xs:string" /> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> </code></pre> <p>The XSD above will permit the XML we want, and allow content and attributes on the same node. I'll probably do some more stuff on XSDs in the near future.</p> <!--kg-card-end: markdown-->]]></content:encoded></item></channel></rss>