<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Igor Perikov on Medium]]></title>
        <description><![CDATA[Stories by Igor Perikov on Medium]]></description>
        <link>https://medium.com/@perikov.igor?source=rss-d3709618dc81------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/2*LuTk3YxatCYIKn1vE63Y0g.jpeg</url>
            <title>Stories by Igor Perikov on Medium</title>
            <link>https://medium.com/@perikov.igor?source=rss-d3709618dc81------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Tue, 23 Jun 2026 17:20:22 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@perikov.igor/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Application Monitoring I Wish I Had Before]]></title>
            <link>https://itnext.io/application-monitoring-i-wish-i-had-before-ce9b305c636c?source=rss-d3709618dc81------2</link>
            <guid isPermaLink="false">https://medium.com/p/ce9b305c636c</guid>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[monitoring-system]]></category>
            <category><![CDATA[microservices]]></category>
            <category><![CDATA[software-development]]></category>
            <dc:creator><![CDATA[Igor Perikov]]></dc:creator>
            <pubDate>Wed, 20 May 2020 19:19:46 GMT</pubDate>
            <atom:updated>2020-05-20T19:51:23.238Z</atom:updated>
            <content:encoded><![CDATA[<p><strong><em>Disclaimer:</em></strong><em> this story covers the general concepts that form the basis of a convenient monitoring system. Implementation details such as Prometheus/Grafana setup properties are out of scope.</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OS1FRduXVyS4oYBj-8sL3A.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@t__bias?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Tobias Jussen</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure><p>Recently, I redesigned and reimplemented application monitoring for a couple of production services. I am happy with the result, so here are my key points:</p><h3>#0: Collect enough data</h3><h4>Observe your system from the inside and the outside</h4><p>An application can’t fully monitor itself by design. For example, suppose a web application router doesn’t expect a special symbol and silently crashes without incrementing an error counter. Unless you have something that can observe the server from the outside (reverse proxy, load balancer, client side), you won’t notice the problem. On the other hand, you can’t observe a system only from the outside, as some details are inherently hidden inside.</p><h4>Gather the most important sensors in one place</h4><p>Chances are high that during an incident you’ll get an alert that says little to nothing about the cause and therefore won’t help you figure out how to stabilize your system. That’s why it’s important to gather the significant sensors in one place, where they can be easily viewed.</p><p>Take your time, think about possible problems, review previous incidents, and assemble the corresponding sensors. 
Don’t forget to revisit this set as your system and your knowledge about it grow.</p><h3>#1: Send notifications when things are bad and immediate action is required</h3><h4>Increase alert reactivity</h4><p>During an incident, an alert should be sent ASAP: a faster reaction means a shorter incident. This heavily depends on what the alert expression looks like. For example, calculating an average slows down the reaction, as does calculating a metric over a big window.</p><p>Percentiles over relatively small windows should be your go-to choice.</p><h4>An alert should represent a breached SLO or degraded user experience</h4><p>Don’t get alerted about hidden potential precursors, like a slight increase in GC timings. GC itself is not harmful, <strong>unless your service SLO was breached because of it.</strong></p><p>If your SLO alert is reactive and GC has degraded enough to affect users, you’ll get notified anyway. <strong>But while users are unaffected, the issue can wait for a while.</strong></p><h3>#2: Don’t send notifications when immediate action is not required</h3><h4>Over-information fallacy</h4><p>Sometimes people decide they want to get notified about every single thing. <strong>Eventually you’ll start ignoring the notifications </strong>and will miss an important one; that’s how the brain works. Don’t fall for this.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*f_1ulBtvKA7YNRMdkGt7tA.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@chairulfajar_?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">@chairulfajar_</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure><h4>Redesign false-positive alerts</h4><p>An alert that beeps without a serious reason is the worst. It distracts engineers and makes them reluctant to react to alerts. 
This happened to me once and it was the dumbest incident of my life.</p><p>Usually an alert is flaky when at least one of these is true:</p><ul><li>The alert triggers on a single threshold violation</li><li>The metric is first evaluated, then aggregated</li><li>During off-peak hours the measured value shifts towards the tail</li></ul><p>Evaluating data over a window (which gets bigger off-peak) or waiting for a few consecutive results should solve the problem.</p><p>Keep in mind it’s completely<strong> opposite to reactivity: a bigger window means less reactivity, a smaller window means more false-positive alerts.</strong> Finding the balance will take time, but it’s worth it.</p><h4>Consider alert scope granularity</h4><p>If you have a lot of stateless instances deployed, say 3+ per availability zone, you don’t need to be notified when one of them becomes faulty. The system itself should tolerate minor errors. Proper capacity planning, adequate timeouts, retries, and circuit breakers should take care of that.</p><p>Alerts configured per instance are candidates for a thorough review.</p><h4>Separate alert channels</h4><p>Although I criticized extensive alerting, it’s crucial to know what’s going on with your system (I can’t stress it enough!). You just don’t need to know it straightaway, so use an asynchronous channel.</p><h3>#3: Ability to observe trends and take necessary actions in advance</h3><p>Long-term trends such as CPU consumption, connection pool size, and the amount of data stored in a database don’t require immediate action and can be examined in the following ways:</p><ol><li>Manual labour: revisit all dashboards on a daily/weekly schedule. This doesn’t scale well with an increasing number of sensors and applications</li><li>Semi-automatic: clone the corresponding alert and set a lower threshold. 
Then attach it to a separate channel, where such alerts are deduplicated, aggregated into a single report, and sent daily/weekly</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2Z3IDdt_2qWlGEJpH8juLw.png" /><figcaption>How different alerts relate to each other</figcaption></figure><p>When I incorporated those, it turned out that being an on-call engineer doesn’t have to be a nightmare :)</p><p>Thanks for reading, hope you learned something useful!</p><hr><p><a href="https://itnext.io/application-monitoring-i-wish-i-had-before-ce9b305c636c">Application Monitoring I Wish I Had Before</a> was originally published in <a href="https://itnext.io">ITNEXT</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How APIs Can Benefit From Computational Cost Equality]]></title>
            <link>https://codeburst.io/how-apis-can-benefit-from-computational-cost-equality-b6424b5c24b1?source=rss-d3709618dc81------2</link>
            <guid isPermaLink="false">https://medium.com/p/b6424b5c24b1</guid>
            <category><![CDATA[microservices]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[web-development]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[software-architecture]]></category>
            <dc:creator><![CDATA[Igor Perikov]]></dc:creator>
            <pubDate>Fri, 27 Mar 2020 00:44:36 GMT</pubDate>
            <atom:updated>2020-03-27T00:44:36.319Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9f-gnhpSDUXGgc91kSgSlw.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@videmusart?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Syd Wachs</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure><h3>Definition</h3><p>Let me coin a phrase for brevity:</p><blockquote>Homogeneous API: an API in which all requests have equal computational cost.</blockquote><p>What’s computational cost? It’s <strong>all the resources consumed to produce an outcome</strong>, e.g. CPU time, memory, disk space, and network bandwidth. Let’s consider a service that stores user-specific information, e.g. Medium. It has a <a href="https://medium.com/me/list/queue">reading list</a> feature, which lets you save stories for later. If we design a backend to serve requests in the form of “give me all stories for user A”, the computational cost will vary from user to user, making the API heterogeneous.</p><p>Now let’s take a look at the pros of making an API homogeneous.</p><h3><strong>Advantages</strong></h3><h4>State-of-the-art observability</h4><p>The deviation of heterogeneous API timings depends on how users behave: how much data they save and consequently request. 
Thus, if a given percentile grows, it’s hard to tell whether the user distribution changed or the service itself slowed down.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*koyfWAhRbx6BJ1ZV9D_O8g.png" /><figcaption>Green squares represent lightweight requests, red ones represent heavyweight requests</figcaption></figure><p>Conversely, users can’t skew a homogeneous system, therefore <strong>there is a higher correlation between service metrics and code pushes.</strong></p><h4>SLOs are easier to enforce</h4><p>Since metrics fairly represent a service’s health, there are fewer false-positive alerts and hence less burden on the operations team.</p><h4>Happy users</h4><p>With a homogeneous API you can be sure that no user struggles consistently with your service. With a heterogeneous API, for example, you can monitor a p999 trend to prevent SLA violations, but a “heavy user” would fall in p9999. <strong>Their experience might degrade over time and you won’t notice the problem unless they complain explicitly.</strong></p><p>So, no more “heavy user” tickets, which are harder to investigate and impossible to prevent. Amazing, right?</p><h4>Fault-tolerance configuration</h4><p>To provide a sane experience for everyone, <strong>you’re obliged to set timeouts and deadlines as long as the heaviest request takes. </strong>Although this allows everyone to be served, lightweight requests might be stuck for an unnecessarily long time. If a request usually completes in under 100ms and the timeout is 1000ms, the client will waste an extra 900ms on a network blip. A shorter timeout would have let the request be retried and succeed earlier. Having all requests of similar size solves this issue, allowing us to apply those patterns effectively.</p><h3>How to implement</h3><p>Making an API homogeneous requires two steps:</p><h4>Pagination</h4><p>Instead of returning everything, return N entities at most. 
Sadly, the need to change clients is not the only cost that comes with pagination.</p><p>Another one is the consistent editing problem. It’s likely to occur with explicit pagination, when a user can see page numbers and navigate between them. The simplest approach to implementing pagination is to calculate static indexes via “skip” and “limit”, so page number P of size S would cover the indexes (math notation):</p><pre>(S*(P - 1), S*P]</pre><p>Now, imagine a situation where a client deletes an entity from one page and goes forward:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*yvIxBR3ZovLiHte4pN-Ghg.png" /><figcaption>Inconsistent editing example</figcaption></figure><p>Since the user deleted entity #3, the entities from page #2 shift to the left, effectively losing entity #4. To see it, the user has to go back or reload the page after the deletion. <strong>Instead of fetching pages by statically calculated indexes, retrieve S entities after N, where N is the identifier of the last entity on the current page. </strong>When jumping more than one page away, calculating indexes is fine.</p><p>Choose the page size wisely: setting it to 1000 when the median object count is 25 won’t make an API homogeneous. <strong>Examine clients to figure out how many objects can be observed on the screen at the same time.</strong> Generally, multiplying this number by 2 to 5 gives a reasonable page size.</p><p>Getting back to the Medium “reading list”: if you take a closer look at it with a network inspector, you’ll see implicit pagination (pages are requested as you scroll down) with page size = 10. So, their “reading list” API is homogeneous (others probably are too, but I didn’t collect any evidence).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*0dSZXs8vf99O2tMGfWYZnQ.png" /></figure><h4>Isolation</h4><p>It’s a paradigm where every service serves one and only one purpose. 
For example, if you have a service with 2 endpoints, restricting their thread pools won’t give perfect isolation. Today, a 60/40 ratio represents the user distribution, but tomorrow’s workload might change and one of the functions would starve. Plus, it leads to unnecessary underutilization.</p><p>Isolation is easier to achieve if you have an application template and a mature CI/CD/deploy infrastructure.</p><p>I am aware of specific domains where making an API homogeneous is impossible or insignificant. To name a few: query engines and math evaluation programs. Nonetheless, I believe it’s useful and feasible for most web scenarios.</p><h4>Resources</h4><ul><li><a href="https://landing.google.com/sre/sre-book/chapters/load-balancing-datacenter/">https://landing.google.com/sre/sre-book/chapters/load-balancing-datacenter/</a>, section “Varying query costs”</li><li><a href="https://landing.google.com/sre/sre-book/chapters/service-level-objectives/">https://landing.google.com/sre/sre-book/chapters/service-level-objectives/</a></li><li><a href="https://itnext.io/5-patterns-to-make-your-microservice-fault-tolerant-f3a1c73547b3">https://itnext.io/5-patterns-to-make-your-microservice-fault-tolerant-f3a1c73547b3</a></li></ul><hr><p><a href="https://codeburst.io/how-apis-can-benefit-from-computational-cost-equality-b6424b5c24b1">How APIs Can Benefit From Computational Cost Equality</a> was originally published in <a href="https://codeburst.io">codeburst</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[5 patterns to make your microservice fault-tolerant]]></title>
            <link>https://itnext.io/5-patterns-to-make-your-microservice-fault-tolerant-f3a1c73547b3?source=rss-d3709618dc81------2</link>
            <guid isPermaLink="false">https://medium.com/p/f3a1c73547b3</guid>
            <category><![CDATA[fault-tolerance]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[microservices]]></category>
            <category><![CDATA[distributed-systems]]></category>
            <category><![CDATA[cloud-computing]]></category>
            <dc:creator><![CDATA[Igor Perikov]]></dc:creator>
            <pubDate>Wed, 08 Jan 2020 13:15:07 GMT</pubDate>
            <atom:updated>2020-01-09T11:26:38.496Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*79TNR1YkcKzehlGf3e_STQ.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@veverkolog?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Dušan Smetana</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure><p>In this article I’ll cover fault tolerance in microservices and how to achieve it. If you look it up on Wikipedia, you will find the following definition:</p><blockquote><strong>Fault tolerance</strong> is the property that enables a system to continue operating properly in the event of the failure of some of its components.</blockquote><p>For us, <em>a component</em> means anything: a microservice, a database (DB), a load balancer (LB), you name it. I won’t cover DB/LB fault-tolerance mechanisms, because they are vendor-specific and enabling them comes down to setting some property or changing the deploy policy.</p><p>As software engineers, <strong>applications are where we have all the power and responsibility</strong>, so let’s take care of them. Here’s the list of patterns I am going to cover:</p><ul><li>Timeouts</li><li>Retries</li><li>Circuit Breaker</li><li>Deadlines</li><li>Rate limiters</li></ul><p>Some of the patterns are widely known, and you might even doubt they are worth mentioning, but stick with the article: I’ll cover the basic forms briefly, then <strong>discuss their flaws and how to overcome them.</strong></p><h3>Timeouts</h3><p>A timeout is a specified period of time that you are allowed to wait for some event to occur. There is a problem if you are using SO_TIMEOUT (also known as socket timeout or read timeout): it represents the timeout between any two consecutive data packets, not for the whole response, so it’s harder to enforce an SLA, especially when the response payload is big. 
<strong>What you usually want is a timeout that covers the whole interaction, from establishing the connection to the very last byte of the response. </strong>SLAs are usually described in terms of such timeouts, because they are humane and natural to us. Sadly, they don’t fit the SO_TIMEOUT philosophy. To overcome this in the JVM world you can use the <a href="https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/HttpClient.html">JDK11</a> or <a href="https://square.github.io/okhttp/">OkHttp</a> client. Go has mechanisms for this in its standard library too.</p><p>If you want to dig in, check my previous <a href="https://itnext.io/why-i-like-go-http-client-as-a-java-developer-676ea1e698b4">article</a>.</p><h3>Retries</h3><p>If your request failed, wait a bit and try again. That’s basically it. Retrying makes sense because the network might degrade for a moment or GC might hit the particular instance your request came to. Now, imagine having a chain of microservices like this:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/1*vLzPABNtUIos5uqcOobxWw.png" /></figure><p>What happens if we set the total number of attempts to 3 at every service and service D suddenly starts returning errors for 100% of requests? It will lead to a retry storm: a situation where every service in the chain starts retrying its requests, drastically amplifying the total load, <strong>so B will face 3x load, C will face 9x, and D will face 27x! </strong>Redundancy is one of the key principles in achieving high availability, but I doubt you would have enough free capacity on clusters C and D in that case. Setting the total number of attempts to 2 doesn’t help much either, plus it makes the user experience worse on small blips.</p><p>Solution:</p><ul><li>Distinguish retryable errors from non-retryable ones. It’s pointless to retry a request when the user doesn’t have permissions or the payload isn’t structured properly. 
On the contrary, <strong>retrying request timeouts or 5xx errors is good.</strong></li><li>Adopt error budgeting, a technique where you <strong>stop making retries if the rate of retryable errors exceeds a threshold</strong>, e.g. if 20% of interactions with service D result in an error, stop retrying and try to degrade gracefully. The number of errors can be tracked with a rolling window over the last N seconds.</li></ul><h3>Circuit Breaker</h3><p>A circuit breaker can be explained as a stricter version of error budgeting: when the error rate is too high, the function won’t be executed at all and will return a fallback result, if provided. A very small portion of requests should be executed anyway in order to find out whether the third party has recovered. <strong>What we want is to give the third party a chance to recover without any manual work.</strong></p><p>You might argue that it doesn’t make sense to enable a circuit breaker if the function is on the critical path, but bear in mind that this short and controlled ‘outage’ is likely to prevent a big and uncontrollable one.</p><p>Although circuit breaking and error budgeting share similar ideas, it makes sense to configure both of them. Since error budgeting is less disruptive, its threshold should be lower.</p><p><a href="https://github.com/Netflix/Hystrix">Hystrix</a> was the go-to circuit breaker implementation on the JVM for a long time. It has since entered <a href="https://github.com/Netflix/Hystrix#hystrix-status">maintenance mode</a>, and its maintainers advise using <a href="https://github.com/resilience4j/resilience4j">resilience4j</a> instead.</p><h3>Deadlines/distributed timeouts</h3><p>We’ve discussed timeouts in the first part of this article; now let’s see how we can make them ‘distributed’. First, revisit the same chain of services calling each other:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1010/1*8_Gd5oRbDCnoRvH1kzSPtw.png" /></figure><p>Service A is willing to wait at most 400ms, and the request requires some work to be done by all 3 downstream services. 
Assume that service B took 400ms and is now ready to call service C. Is that reasonable at all? No! Service A has timed out and isn’t waiting for the result anymore. <strong>Proceeding further will only waste resources</strong> and increase susceptibility to retry storms.</p><p>To implement deadlines, we must add extra metadata to a request that helps each service understand when it’s reasonable to interrupt processing. Ideally, this should be supported by all participants and passed throughout the system.</p><p>In practice this metadata is one of the following:</p><ul><li><strong>Timestamp</strong>: pass the point in time at which your service will stop waiting for the response. First, the gateway/frontend service sets the deadline to <em>‘current timestamp + timeout’</em>. Next, any downstream service should check whether current timestamp ≥ deadline. If the answer is yes, it’s safe to stop processing the request; otherwise, start processing. Unfortunately, there is a problem with <a href="https://en.wikipedia.org/wiki/Clock_skew">clock skew</a>, when machines have different clock times. If it occurs, requests will get stuck and/or be rejected immediately, causing an outage.</li><li><strong>Timeout</strong>: pass the amount of time the service is allowed to wait. This is a bit trickier to implement. As before, you set the deadline as early as possible. Next, any downstream service should calculate how much time it has spent, subtract it from the inbound timeout, and pass the result to the next participant. <strong>It’s crucial not to forget time spent waiting in the queue!</strong> So, if service A is allowed to wait 400ms and service B spent 150ms, B must attach a 250ms timeout when calling service C. Although this doesn’t account for time spent on the wire, the deadline can only trigger later, not earlier, potentially consuming slightly more resources but not spoiling the outcome. 
Deadlines are implemented this way in <a href="https://grpc.io/">gRPC</a>.</li></ul><p>The last thing to discuss: does it ever make sense not to interrupt the call chain when the deadline is exceeded? The answer is yes: <strong>if your service has plenty of free capacity and completing the request will make it hotter (cache/JIT), it’s okay to keep processing.</strong></p><h3>Rate limiters</h3><p>The previously discussed patterns mostly solve the problem of cascading failures, a situation where a dependent service collapses after its dependency has collapsed, eventually leading to a full shutdown. Now, let’s cover the situation where your service is overloaded. There are plenty of technical and domain-specific reasons why it might happen; just assume it happened.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3jL9q0luHheQs-d-bucbCQ.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@mrthetrain?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Joshua Hoehne</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure><p>Every application has a capacity that is unknown in advance. <strong>This value is dynamic and depends on multiple variables</strong>, such as recent code changes, the CPU model the application is running on right now, how busy the host machine is, etc.</p><p>What happens when load surpasses capacity? Usually, this vicious cycle occurs:</p><ol><li>Response time grows, GC footprint increases</li><li>Clients get more timeouts, even more load arrives</li><li>goto 1, but more severe</li></ol><p>This is an example of what might happen. Sure, if clients have error budgeting or a circuit breaker, the 2nd item might not create extra load, giving the service a chance to leave this cycle. Other things might happen instead: removing an instance from the LB’s upstream list might create more inequality in load, bring down neighboring instances, and so on.</p><p>Limiters to the rescue! 
Their idea is to shed incoming load gracefully. Ideally, this is how excessive load should be handled:</p><ol><li><strong>The limiter drops extra load above capacity, thus letting the application serve requests in compliance with the SLA</strong></li><li>The excessive load is redistributed to other instances, or the cluster auto-scales, or the cluster gets scaled by a human</li></ol><p>There are 2 types of limiters, rate and concurrency: the former restricts inbound RPS, the latter restricts the number of requests being processed at any moment in time.</p><p>For the sake of simplicity, I’ll make an <strong>assumption that all requests to our services are nearly equal in computational cost and have the same importance.</strong> Computational inequality arises from the fact that different users can have different amounts of data associated with them, e.g. favorite TV series or previous orders. Usually, embracing pagination helps to achieve computational equality of requests.</p><p>The rate limiter is more widely used, but it doesn’t provide guarantees as strong as a concurrency limit does, so if you have to choose one, stick with the concurrency limit. Here’s why.</p><p>When configuring a rate limiter, we think we enforce the following:</p><blockquote>This service can process N requests per second at any point in time.</blockquote><p>But what we actually declare is this:</p><blockquote><strong>Assuming that response time won’t change, </strong>this service can process N requests per second at any point in time.</blockquote><p>Why is this remark important? I’ll ‘prove’ it by intuition. For those who want a math-based proof, check <a href="https://en.wikipedia.org/wiki/Little%27s_law">Little’s law</a>. Assuming the rate limit is 1000 RPS, the response time is 1000ms, and the SLA is 1200ms, we easily serve exactly 1000 requests per second within the given SLA.</p><p>Now, the response time grows by 50ms (a dependency service started doing extra work). 
Every second from now on, the service will face more and more requests being processed at the same time, because the arrival rate is bigger than the service rate. Having an unlimited number of workers means <strong>you will run out of resources and collapse, especially in environments where workers map 1:1 to OS threads</strong>. How would a concurrency limit with 1000 workers handle it? It will reliably serve 1000/1.05 = ~950 RPS without SLA violation and drop the rest. Also, no reconfiguration is needed to catch up!</p><p>We could update the rate limit every time a dependency changes, but this is an immensely big burden, potentially requiring the whole ecosystem to be reconfigured on every change.</p><p>Depending on how the limit value is set, a limiter is either static or dynamic.</p><h4>Static</h4><p>In this case the limit is configured manually. The value can be assessed by regular performance tests. Although it won’t be 100% accurate, it can be set pessimistically for safety. <strong>This type of limiting requires work to be done around CI/CD pipelines and has lower resource utilization.</strong> A static limiter can be implemented by restricting the size of the worker thread pool (concurrency only), by adding an inbound filter that counts requests, by <a href="https://docs.nginx.com/nginx/admin-guide/security-controls/controlling-access-proxied-http/">NGINX limiting functionality</a>, or by an <a href="https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/cluster/circuit_breaker.proto.html">envoy sidecar proxy</a>.</p><h4>Dynamic</h4><p>Here, the limit depends on a metric that is recalculated on a regular basis. <strong>Chances are high that for your service there is a correlation between being overloaded and growth in response time.</strong> If so, the metric can be a statistical function over response times, e.g. a percentile, the median, or the average. Remember the computational equality property? This property is key to more accurate calculations.</p><p>Then, define a predicate that answers whether the metric is healthy. 
For example, p99 ≥ 500ms is considered unhealthy, so the limit should be decreased. How the limit is increased and decreased should be decided by a feedback control algorithm, like <a href="https://en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease">AIMD</a> (which is used in TCP). Here’s pseudocode for it:</p><pre>if healthy {<br>    limit = limit + increase;<br>} else {<br>    limit = limit * decreaseRatio; // 0 &lt; decreaseRatio &lt; 1.0<br>}</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*59j2XNxrJg4IElYxVHdPsg.png" /><figcaption>AIMD in action</figcaption></figure><p>As you can see, the limit grows slowly, probing whether the application is doing well, and decreases steeply if faulty behavior is found.</p><p>Netflix pioneered the idea of dynamic limits and open-sourced their solution; here’s the <a href="https://github.com/Netflix/concurrency-limits">repo</a>. It has implementations of several feedback algorithms, a static limiter implementation, gRPC integration, and Java servlet integration.</p><p>Phew, that’s it! I hope you learnt something new and useful today. I’d like to note that <strong>this list is not exhaustive; you would also want to achieve good observability</strong>, because something unexpected might happen and it’s better to understand what’s going on with your application at that moment. 
Nevertheless, implementing those will solve a vast number of your current or potential problems.</p><h4>References</h4><ul><li><a href="https://landing.google.com/sre/sre-book/toc/index.html">Google SRE book</a>, especially chapters <a href="https://landing.google.com/sre/sre-book/chapters/addressing-cascading-failures/">Addressing Cascading Failures</a> and <a href="https://landing.google.com/sre/sre-book/chapters/handling-overload/">Handling Overload</a></li><li><a href="https://medium.com/@NetflixTechBlog/performance-under-load-3e6fa9a60581">Netflix article about dynamic limits</a></li></ul><hr><p><a href="https://itnext.io/5-patterns-to-make-your-microservice-fault-tolerant-f3a1c73547b3">5 patterns to make your microservice fault-tolerant</a> was originally published in <a href="https://itnext.io">ITNEXT</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why I like Go HTTP client as a Java developer]]></title>
            <link>https://itnext.io/why-i-like-go-http-client-as-a-java-developer-676ea1e698b4?source=rss-d3709618dc81------2</link>
            <guid isPermaLink="false">https://medium.com/p/676ea1e698b4</guid>
            <category><![CDATA[java]]></category>
            <category><![CDATA[golang]]></category>
            <category><![CDATA[http-request]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[Igor Perikov]]></dc:creator>
            <pubDate>Tue, 26 Nov 2019 07:46:25 GMT</pubDate>
            <atom:updated>2019-11-26T07:46:25.260Z</atom:updated>
            <content:encoded><![CDATA[<p>We live in an era of microservices. They involve <strong>A LOT </strong>of communication inside their ecosystems. You might have seen the so-called death stars of microservices:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/947/1*mZkidifrgL4zIcJKv5_uCQ.jpeg" /><figcaption>image by <a href="http://www.csl.cornell.edu/~delimitrou/papers/2019.asplos.microservices.pdf">Cornell University</a> researchers</figcaption></figure><p>Every single click on the modern internet triggers a multitude of network calls and, as you probably know, the <a href="https://blog.acolyer.org/2014/12/18/the-network-is-reliable/">network is unreliable</a>. This is one of the reasons <strong>you should set request timeouts </strong>when fetching something over the wire. It’s better to incorporate some other techniques as well, but that’s a topic for another article (or even a whole book).</p><h4>Why bother with timeouts?</h4><p>Setting them is very important, because the network might fail, or an instance might slow down or crash. You don’t want users to wait indefinitely just because 1 of your 1000 servers suddenly shut down. People hate waiting; it affects their happiness and, therefore, <a href="https://www.machmetrics.com/speed-blog/how-does-page-load-time-affect-your-site-revenue/">your revenue</a>. If a network call gets stuck, you should abandon it and retry; unless your cluster is having problems, the retry will most likely succeed.</p><p>Let’s see how we can implement this in Java and Go.</p><h4>Common Java approach</h4><p>The most popular HTTP client in Java is <a href="https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient">Apache HttpClient</a>. 
Here’s what configuring request timeouts looks like:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/55a1fd1b170f5b788fd04193e090e343/href">https://medium.com/media/55a1fd1b170f5b788fd04193e090e343/href</a></iframe><p>Those 3 timeouts are all different and independent. The biggest problem we’re facing here is how to split a 500 ms budget into 3 different timeouts: should it be 100 + 100 + 300 or 50 + 50 + 400? And, more importantly, do we even care if connecting takes 200 ms? Imagine that the server completes the request in 50 ms, so the total response time is 250 ms. In most situations <strong>you don’t care, it’s completely fine!</strong></p><p>On the other hand, you can’t make the timeouts much bigger, because that will lead to longer requests. Also, the socket timeout is just a timeout between any two consecutive packets being read from the socket, <strong>not for the whole response </strong>being sent back to you.</p><p>Nonetheless, it’s all we can do using Apache HttpClient. Let’s take a look at Go.</p><h4>Go standard library</h4><p>Go, however, supports a timeout for the whole client call:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/b99f80398e6213ee30ada9e52e60acd8/href">https://medium.com/media/b99f80398e6213ee30ada9e52e60acd8/href</a></iframe><p>Here we’re setting the call timeout to 2 seconds and calling <a href="http://httpbin.org">httpbin</a>, which will imitate work for 1 second and then return a response. Launching this will lead to a successful call.</p><p>If we set the call timeout to 1 second, it will fail with a message similar to the one below, which is exactly what we are looking for!</p><pre>2019/11/25 19:06:09 Get <a href="http://httpbin.org/delay/1">http://httpbin.org/delay/1</a>: net/http: request canceled (Client.Timeout exceeded while awaiting headers)</pre><p>However, configuring only a call timeout <strong>is not always the best we can achieve</strong>. 
Imagine having a long and heavy request which takes 10 seconds to complete. It would be nasty to wait 10 seconds for a response only to find out that the connection was still being established all this time and no actual work was done.</p><p>The solution to this? A connect timeout. Yes, the one we were sort of blaming earlier. Combined with a call timeout, it gives robust protection to your network interactions:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/e124dbe14bbc6607cdf0a53f6be92406/href">https://medium.com/media/e124dbe14bbc6607cdf0a53f6be92406/href</a></iframe><p>Exceeding the connect timeout gives us the desired behavior:</p><pre>2019/11/25 19:12:25 Get <a href="http://httpbin.org/delay/5">http://httpbin.org/delay/5</a>: dial tcp 54.172.95.6:80: i/o timeout</pre><p>Go gives you a lot of flexibility with its standard HTTP client, and the library is supported by the vendor. A perfect combination, <strong>that’s why I like it very much!</strong> Now let’s get back to the Java world one more time.</p><h4>So, is Java doomed?</h4><p><strong>No.</strong> There are less popular alternatives, such as the <a href="https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/HttpClient.html">JDK 11</a> HTTP client and <a href="https://square.github.io/okhttp/">OkHttp</a>, which support call timeouts. Sadly, those are not the top results when searching on Google, so people are less aware of them or less willing to start using them.</p><h4>JDK 11</h4><p>It is a modern HTTP client shipped with the standard library; it supports HTTP/1.1, HTTP/2 and async calls via CompletableFuture, and provides a convenient API to work with. 
Let’s combine both timeouts with it:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7e34d7e028ae7a1f7dd2c061e6708dc9/href">https://medium.com/media/7e34d7e028ae7a1f7dd2c061e6708dc9/href</a></iframe><p>This client provides nice, easy-to-understand error messages (especially compared to Go) for connect and call timeouts respectively:</p><pre>Exception in thread &quot;main&quot; java.net.http.HttpConnectTimeoutException: HTTP connect timed out</pre><pre>Exception in thread &quot;main&quot; java.net.http.HttpTimeoutException: request timed out</pre><h4>OkHttp</h4><p>OkHttp supports call and connect timeouts too. Both can be configured at the client level, which is slightly more convenient than the JDK 11 implementation:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d4c6061d266dbfa05149cb0c52553901/href">https://medium.com/media/d4c6061d266dbfa05149cb0c52553901/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*yXo-_LI9yrR_ze2az0XpMg.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@agebarros?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Agê Barros</a> on <a href="https://unsplash.com/s/photos/timer?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure><h4>How to choose timeout values, anyway?</h4><p>The last thing I’d like to mention is how to derive those 2 numbers. The call timeout is more about the SLA/SLO you have with other services, and the connect timeout is about your expectations of the underlying network. 
For example, if you’re sending requests within the same datacenter, then 100ms would be fine (although a connection should normally be established in under 5ms), but operating over mobile networks (which are more error-prone) will require higher connect timeouts.</p><h4>Wrap-up</h4><p>In this article we discussed why timeouts are important, why ‘classic’ timeouts don’t meet modern requirements and which tools to use to enforce them. I hope you found something useful in it.</p><hr><p><a href="https://itnext.io/why-i-like-go-http-client-as-a-java-developer-676ea1e698b4">Why I like Go HTTP client as a Java developer</a> was originally published in <a href="https://itnext.io">ITNEXT</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Open source made easier]]></title>
            <link>https://itnext.io/open-source-made-easier-aab8aebd849f?source=rss-d3709618dc81------2</link>
            <guid isPermaLink="false">https://medium.com/p/aab8aebd849f</guid>
            <category><![CDATA[github]]></category>
            <category><![CDATA[kotlin]]></category>
            <category><![CDATA[hacktoberfest]]></category>
            <category><![CDATA[open-source]]></category>
            <dc:creator><![CDATA[Igor Perikov]]></dc:creator>
            <pubDate>Tue, 01 Oct 2019 06:44:06 GMT</pubDate>
            <atom:updated>2019-10-10T10:54:31.831Z</atom:updated>
            <content:encoded><![CDATA[<h3><strong>TL;DR</strong></h3><p>I made a program that finds, among your starred repositories, the issues that are most likely to be a good choice for making a contribution. For a quick start, check out the How-To on the <a href="https://github.com/igorperikov/mighty-watcher#how-to-use">GitHub project page</a>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/978/1*ojMfsXBgceCLeoDfoDHGqg.gif" /></figure><h3>Why would one need it?</h3><p><strong>Developers are obsessed</strong> with open source development nowadays. Everyone wants to give back to the community and to gain new skills and experience while working on the most widely known software products. Imagine getting invaluable insights from experienced engineers all over the world: pretty exciting, right?</p><p>However, people struggle to find appropriate tasks to work on. If you are new to a project, the maintainers most likely won’t let you work on some big, fancy feature, because you would probably miss plenty of details and corner cases. Everyone agrees that <strong>one had better start with the easiest tasks</strong>, learn the ropes, earn some trust from the maintainers, and then proceed to harder ones.</p><h3>So, how do you find these easy issues to work on?</h3><p>Some time ago, GitHub started listing appropriate issues, e.g., <a href="https://github.com/golang/go/contribute">https://github.com/golang/go/contribute</a>. You can access this list from the header of the Issues page. 
This is what it looks like:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*SieFNepKy7BwVtRnMRB8EQ.png" /><figcaption>They’ve collected some good first issues for you!</figcaption></figure><p>Although it is pretty accurate, it sometimes misses items, for example when issues lack the appropriate labels.</p><p>My usual <em>“let-me-make-a-contribution”</em> day used to look like this: I would look through my starred repositories and jump to this “new contribution” section, or scroll through the issues list manually. That was a very monotonous and time-consuming task, so I would quickly get bored and give up. That is when I decided to automate it.</p><h3>Solution</h3><p>First, I needed to work out the rules of the game. I did some research and came up with the following heuristics:</p><ul><li>An issue should not be closed.</li><li>Assigned means assigned; don’t waste your time on it.</li><li>An issue should be marked with an “easy” label or something similar.</li><li>Old issues are likely to be outdated; if one remains unsolved for too long, the maintainers are probably not interested in it and less interested in helping you during the process.</li></ul><p>Furthermore, it turns out that every repository uses its own set of labels, <strong>and it is rarely just “help wanted”</strong>. There are multitudes of them. I looked through all the labels in my starred repositories and other popular ones and came up with a <a href="https://github.com/IgorPerikov/mighty-watcher/blob/master/src/main/kotlin/com/github/igorperikov/mightywatcher/service/EasyLabelsStorage.kt">final list of ~60 labels</a>. <strong>If any of your repositories have different labels, create a pull request or contact me</strong>, and I will add them as soon as possible. 
With the boundaries set, it was time to create the program itself.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*f9Is6VeFUr1jQYsYeSCfnA.png" /></figure><p>I built a Kotlin application that looks through the list of starred repositories, searches for issues with the predefined labels, sorts them and prints them to the console. It is currently distributed as a Docker image: people are more likely to have Docker installed than Java ;)</p><p>As the list of labels grew, I noticed a drastic decrease in speed and an increase in API rate limit consumption. The first edition of the tool searched each repository for issues with every label on the list, although 95% of those labels had never been used in that repo! That was my “AHA!” moment, so I added an extra call to fetch all the labels from the repository and intersect them with my list. This significantly improved performance and used up less of the API limits.</p><p>On a daily basis, the tool works fine as a guide to the world of open-source contributions, and it will also help you to achieve your <a href="https://hacktoberfest.digitalocean.com/">Hacktoberfest</a> goals. To try it, visit the project page and follow the <a href="https://github.com/igorperikov/mighty-watcher#how-to-use">instructions</a>. 
If you lack starred repositories, I have some <a href="https://github.com/igorperikov/mighty-watcher#lacking-starred-repositories">advice</a> for you as well.</p><p>If you have something to say, do so in the section below, or in the dedicated <a href="https://github.com/IgorPerikov/mighty-watcher/issues/67">feedback issue</a>.</p><hr><p><a href="https://itnext.io/open-source-made-easier-aab8aebd849f">Open source made easier</a> was originally published in <a href="https://itnext.io">ITNEXT</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>