Fine Tune Your Polling and Batching in Mule ESB

They say it's best to learn from others. With that in mind, let's dive into a use case I recently ran into. We were dealing with a number of legacy systems when our company decided to shift to a cloud-based solution. Of course, we had to prepare for the move and all the complications that came with it.

Use Case

We have a legacy system built on Oracle DB, using Oracle Forms to create applications and lots and lots of stored procedures in the database. It has been in use for over 17 years with no major upgrades or changes. Of course, there have been a lot of development changes over those 17 years that have taken the system close to the breaking point and made it almost impossible to implement anything new. So the company decided to move to a CRM (Salesforce), and we needed to transfer data to SF from our legacy database. However, we couldn't create any triggers on our database to send real-time data to SF during the transition period.

Solution

We decided to use Mule Poll to poll our database and get the records in bulk, then send them to SF using the Salesforce Mule connector.

I am assuming that we are all clear about polling in general. If not, please refer to the references at the end. If you are not clear about the Mule polling implementation, there are a few references at the bottom for that, too.

Sounds simple enough, doesn't it? But wait, there are a few things to consider:

What is the optimum poll frequency?
How many threads do you want for each poll? How many active or inactive threads do you want to keep?
How many polls can we write before we break the object store and queue store used by Mule to maintain your polling?
What is the impact on the server file system if you use watermark values in the object store?
How many records can we fetch in one query from the database?
How many records can we actually send in bulk to Salesforce using SFDC?

These are a few, if not all, of the considerations you have to work through before implementation.
The major part of polling is the watermark and how Mule implements the watermark on the server.

Polling for Updates Using Watermarks

Rather than polling a resource for all its data with every call, you may want to acquire only the data that has been newly created or updated since the last call. To acquire only new or updated data, you need to keep a persistent record of either the item that was last processed or the time at which your flow last polled the resource. In the context of Mule flows, this persistent record is called a watermark.

To persist the watermark, Mule ESB stores it in the object store, in the runtime directory of the project on the ESB server. Depending on the type of object store you have implemented, you may have a SimpleMemoryObjectStore or a TextFileObjectStore, either of which can be configured.

For any kind of object store, Mule ESB creates files on the server, and if the frequency of your polls is not carefully configured, you may run into file storage issues on your server. For example, if you are running your poll every 10 seconds with multiple threads, and your flow takes more than 10 seconds to send data to SF, then a new object store entry is made to persist the watermark value for each flow trigger, and you will end up with too many files in the server object store.

To set these values, we have to consider how many records we are fetching from the database, as SF has a limit of 200 records that you can send in one bulk call. So, if you are fetching 2,000 records, one batch will call SF 10 times to transfer those 2,000 records.
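The batch arithmetic above can be sketched in a few lines of Haskell. This is only an illustration of the numbers in the text, not Mule or Salesforce code: the names sfBatchLimit, chunksOf, callsNeeded, pollDuration, and pollsOverlap are hypothetical helpers, and the per-call duration is an assumed input you would have to measure for your own flow.

```haskell
-- Bulk limit quoted in the article: 200 records per Salesforce call.
sfBatchLimit :: Int
sfBatchLimit = 200

-- Split a list into consecutive chunks of at most n items.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = error "chunksOf: chunk size must be positive"
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf n t

-- Number of calls needed for a record count (ceiling division).
callsNeeded :: Int -> Int
callsNeeded records = (records + sfBatchLimit - 1) `div` sfBatchLimit

-- Duration of one full poll, given an assumed per-call time in seconds.
pollDuration :: Int -> Int -> Int
pollDuration secondsPerCall records = secondsPerCall * callsNeeded records

-- True when the next poll fires before the previous one has finished.
pollsOverlap :: Int -> Int -> Int -> Bool
pollsOverlap frequencySeconds secondsPerCall records =
  pollDuration secondsPerCall records > frequencySeconds
```

Loaded into GHCi, callsNeeded 2000 gives the 10 calls mentioned above, and pollsOverlap takes the poll frequency, the per-call time, and the record count, so it can be used to check whether a given frequency leaves enough room for one poll to finish before the next begins.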
If your flow takes five seconds to process 200 records, including the network round trip to SF, then your complete poll will take around 50 seconds to transfer 2,000 records. If the polling frequency is 10 seconds, that means the object store keeps piling up.

Another issue that will arise is the queue store. Because there is a big gap between the frequency and the execution time, the queue store will also keep growing. Again, you have to deal with too many files.

To resolve this, it’s always a good idea to fine-tune the execution time of the flow and the poll frequency to keep the gap small. To manage the threads, you can use Mule's batch flow threading function to control how many threads you want to run and how many you want to keep active.

I hope a few of these details help you set up your polling in a better way.

There are a few more things we have to consider. What happens when an error occurs while sending data? What happens when SF gives you an error and can't process your data? What about the types of errors SF will send you? How do you rerun your batch with the watermark value if it failed? What about logging and recovery? I will try to cover these issues in a second blog post.

References:

Destroy All IFs: A Perspective From Functional Programming

The Anti-IF Campaign currently stands at 4,136 signatures, and there’s a reason: Conditional statements are frequently a troublesome source of bugs and brittle logic, and they make reasoning about code difficult because they multiply the code paths.

The problem is not necessarily with conditionals: It's with the boolean values that are required to use conditionals. A boolean value reduces a huge amount of information to a single bit (0 or 1). Then on the basis of that bit, the program makes a decision to take one path or a totally different one.

What could possibly go wrong?

The Root Problem

I've long argued that, despite all its flaws, one of the great things about functional programming is its intrinsic inversion of control.

In an impure imperative language, if you call some function void doX(State *state), the function can do anything it wants. It can modify state (both passed in and global), it can delete files, it can read from the network, and it can even launch nukes!

In a pure functional language, however, if you call some function doX :: State -> IO (), then at most, it's going to return a value.
It can't modify what you pass it, and if you like, you can ignore the value returned by the function, in which case calling the function has no effect (aside from sucking up a little CPU and RAM).

Now granted, IO in Haskell is not a strong example of an abstraction that has superior reasoning properties, but the fundamental principle is sound: In purely functional programming languages, the control is inverted to the caller.

Thus, as you're looking at the code, you have a clearer sense of what the functions you are calling can do, without having to unearth their foundations.

I think this generalizes to the following principle:

Code that inverts control to the caller is generally easier to understand.

(I actually think this is a specialization of an even deeper principle: that code that destroys information is generally harder to reason about than code that does not.)

Viewed from this angle, you can see that if we embed conditionals deep into functions, and we call those functions, we have lost a certain amount of control: We feed in inputs, but they are combined in arbitrary and unknown ways to arrive at a decision (which code path to take) that has humongous ramifications on how the program behaves.

It's no wonder that conditionals (and with them, booleans) are so widely despised!

A Closer Look

In an object-oriented programming language, it's generally considered good practice to replace conditionals with polymorphism.

In a similar fashion, in functional programming, it's often considered good practice to replace boolean values with algebraic data types.

For example, let's say we have the following function that checks to see if some target string matches a pattern:

    match :: String -> Bool -> Bool -> String -> Bool
    match pattern ignoreCase globalMatch target = ...

(Let's ignore the boolean return value and focus on the two boolean parameters.)

Looking at the match function, you probably see how its very signature is going to cause lots of bugs.
Developers (including the one who wrote the function!) will confuse the order of parameters and forget the meaning.

One way to make the interface harder to misuse is to replace the boolean parameters with custom algebraic data types:

    data Case = CaseInsensitive | CaseSensitive
    data Predicate = Contains | Equals

    match :: String -> Case -> Predicate -> String -> Bool
    match pattern cse pred target = ...

By giving each parameter a type, and by using names, we force the developers who call the function to deal with the type and semantic differences between the second and third parameters.

This is a big improvement to be sure, but let's not kid ourselves: Both Case and Predicate are still "boolean" values that have 1 bit of information each, and inside of match, big decisions will be made based on those bits!

We've cut out some potential for error, but we haven't gone as far as our object-oriented kin because our code still contains conditionals.

Fortunately, there's a simple technique you can use to purge almost all conditionals from your code base. I call it replace conditional with lambda.

Replace Conditional With Lambda

When you're tempted to write a conditional based on boolean values that are passed into your function, I encourage you to rip out those values and replace them with a lambda that performs the effect of the conditional.

For example, in our preceding match function, somewhere inside, there's probably a conditional that looks like this:

    let (pattern', target') = case cse of
          CaseInsensitive -> (toUpperCase pattern, toUpperCase target)
          CaseSensitive   -> (pattern, target)

That normalizes the case of the strings based on the sensitivity flag.

Instead of making a decision based on a bit, we can pull out the logic into a lambda that's passed into the function:

    type Case = String -> String
    data Predicate = Contains | Equals

    match :: String -> Case -> Predicate -> String -> Bool
    match pattern cse pred target = ...
      let (pattern', target') = (cse pattern, cse target)

In this case, because we are accepting a user-defined lambda that we are then applying to user-defined functions, we can actually perform a further refactoring to create a match combinator:

    type Matcher = String -> Predicate -> String -> Bool

    caseInsensitive :: Matcher -> Matcher

    match :: Matcher

Now a user can choose between a case-sensitive and a case-insensitive match with the following code:

    match a b c                  -- case sensitive
    caseInsensitive match a b c  -- case insensitive

Of course, we still have another bit and another conditional: the predicate flag. Somewhere inside the match function, there’s a conditional that looks at Predicate:

    case pred of
      Contains -> contains pattern' target'
      Equals   -> eq pattern' target'

This can be extracted into another lambda, which takes the pattern and the target:

    match :: String -> (String -> String -> Bool) -> String -> Bool

Now you can test strings like so:

    caseInsensitive match "foo" eq "foobar"        -- false
    caseInsensitive match "fOO" contains "foobar"  -- true

Of course, now the function match has been simplified so much that it no longer needs to exist, but in general, that won't happen during your refactoring.

Note that as we performed this refactoring, the function became more general-purpose. I consider that a side benefit of the technique, which replaces highly specialized code with more generic code.

Now let's go through a real-world case instead of a made-up toy example.

psc-publish

The PureScript compiler has a publish tool with a function defined approximately as follows:

    publish :: Bool -> IO ()
    publish isDryRun =
      if isDryRun
        then do
          _ <- unsafePreparePackage dryRunOptions
          putStrLn "Dry run completed, no errors."
        else do
          pkg <- unsafePreparePackage defaultPublishOptions
          putStrLn (A.encode pkg)

The purpose of publish is either to do a dry-run publish of a package (so the user can make sure it works) or to publish the package for real.

Currently, the function accepts a boolean parameter, which indicates whether it's a dry run.
The function branches off this bit to decide the behavior of the program.

Let's unify the two code branches by extracting out the options and introducing a lambda to handle the different messages printed after package preparation:

    type Announcer = String -> IO ()

    dryRun :: Announcer
    dryRun = const (putStrLn "Dry run completed, no errors.")

    forReal :: Announcer
    forReal pkg = putStrLn (A.encode pkg)

    publish :: PublishOptions -> Announcer -> IO ()
    publish options announce = unsafePreparePackage options >>= announce

This is better, but it's still not great because we have to feed two parameters into the function publish, and we've introduced new combinations that don’t make sense (publish with dry run options, but announce with forReal).

To solve this, we'll extend PublishOptions with an announcer field, which lets us collapse the code to the following:

    forRealOptions :: PublishOptions
    forRealOptions = ...

    dryRunOptions :: PublishOptions
    dryRunOptions = ...

    publish :: PublishOptions -> IO ()
    publish options = unsafePreparePackage options >>= announcer options

Now a user can call the function like so:

    publish forRealOptions
    publish dryRunOptions

There are no conditionals and no booleans, just functions. You never need to decide what bits mean, and you can't get them wrong.

Now that the control has been inverted, the caller of the publish function has the ultimate power and can decide what happens deep in the program merely by passing the right functions in.

If you don't have any trouble reasoning about code that overflows with booleans and conditionals, then more power to you!

For the rest of us, however, conditionals complicate reasoning. Fortunately, just as OOP has a technique for getting rid of conditionals, we functional programmers have an alternate technique that's just as powerful.

By replacing conditionals with lambdas, we can invert control and make our code both easier to reason about and more generic.

So what are you waiting for? Go sign the Anti-IF Campaign today.
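For readers who want to experiment with the match refactoring from earlier in this post, here is a minimal sketch of where it ends up. It rests on a few assumptions that are not in the post: the predicate is taken to be a two-argument function (pattern first, target second), toUpper from Data.Char stands in for toUpperCase, and eq and contains are defined here as plain aliases for (==) and isInfixOf.

```haskell
import Data.Char (toUpper)
import Data.List (isInfixOf)

-- A predicate compares the pattern with the target (assumed argument order).
type Predicate = String -> String -> Bool

-- A matcher takes a pattern, a predicate, and a target.
type Matcher = String -> Predicate -> String -> Bool

-- Case-sensitive matching: just hand both strings to the predicate.
match :: Matcher
match pat p target = p pat target

-- Wrap any matcher so that both strings are upper-cased before matching.
caseInsensitive :: Matcher -> Matcher
caseInsensitive m pat p target = m (map toUpper pat) p (map toUpper target)

-- Assumed stand-ins for the post's eq and contains.
eq :: Predicate
eq = (==)

contains :: Predicate
contains = isInfixOf
```

In GHCi, caseInsensitive match "fOO" contains "foobar" can be compared with the plain match "fOO" contains "foobar" to see the case handling switched on and off by the caller. Note that match itself contains no conditional on a flag; the only branching left is whatever the supplied predicate does.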

Megaupload coming back? Founder Kim Dotcom plans a relaunch in 2017

Flamboyant German tech entrepreneur Kim Dotcom is planning to relaunch file-sharing website Megaupload in January 2017, five years after the U.S. government took down the site, accusing it of piracy.

Megaupload, founded in 2005, had boasted of having more than 150 million registered users and 50 million daily visitors. At one point, it was estimated to be the 13th most frequently visited website on the internet.

Dotcom, who announced his plans in a series of tweets on Friday, said most of the Megaupload users would get their accounts reinstated with premium privileges.

He also hinted that the new website will use bitcoins. Dotcom did not immediately respond to an email seeking comment.

Dotcom and three others were arrested on Jan. 20, 2012, after armed New Zealand police raided his country estate at the request of the U.S. Federal Bureau of Investigation. U.S. authorities had said Dotcom and three other Megaupload executives cost film studios and record companies more than $500 million and generated more than $175 million by encouraging paying users to store and share copyrighted material, such as movies and TV shows.

Dotcom, who has New Zealand residency, has denied charges of internet piracy and money laundering and has been fighting extradition to the United States.

He has contended that the website was merely a storage facility for online files and should not be held accountable if stored content was obtained illegally. A New Zealand court in 2013 granted Dotcom access to all evidence seized by police in the raid of his house.

Although Kim Dotcom's net worth was not known, he became well known for his lavish lifestyle as much as his computer skills.

He used to post photographs of himself with cars having vanity plates such as "GOD" and "GUILTY", shooting an assault rifle and flying around the world in his private jet. The U.S. Federal Bureau of Investigation estimated in 2012 that Dotcom personally made around $115,000 a day during 2010.

The assets seized earlier included nearly 20 luxury cars, one of them a pink Cadillac, works of art, and NZ$10 million invested in local finance companies.

"I'll be the first tech billionaire who got indicted, lost everything and created another billion $ tech company while on bail," he tweeted on Sunday.

(Reporting by Supantha Mukherjee in Bengaluru; Editing by Shounak Dasgupta)

Samsung Electronics set for best quarter in over two years on second-quarter smartphone boost

SEOUL Tech giant Samsung Electronics Co Ltd is poised to issue guidance for its best quarterly profit in more than two years, propelled by a surge in mobile earnings on the back of robust sales of its flagship Galaxy S7 smartphones.

The South Korean giant will disclose its estimates for second-quarter earnings on Thursday, with analysts predicting a strong mobile division contributed to a 13 percent jump in operating profit from the same period a year earlier.

The average forecast from a Thomson Reuters survey of 16 analysts tips Samsung to report April-June operating profit of 7.8 trillion won ($6.8 billion), the highest since an 8.5 trillion won profit in January-March of 2014.

The mobile division of the world's top maker of smartphones and memory chips was likely its top earner for the second straight quarter with a 4.3 trillion won profit, according to the survey.

Samsung surprised many with better-than-expected first-quarter earnings, and issued guidance for a further pickup in April-June.

"Galaxy S7 sales are better than expected in the first half, and the semiconductor business is also outperforming rivals," said KTB Asset Management's Lee Jin-woo.

The fund manager estimated the firm's quarterly operating profit would also stay strong in both the third and fourth quarters at between 7 trillion won and 8 trillion won in each.

Samsung's smartphone business had been squeezed before the start of this year between Apple Inc, at the high end of the market, and Chinese rivals like Huawei Technologies [HWT.UL] in the budget segment.

But the Galaxy S7 has provided a catalyst for the earnings rebound, likely putting the mobile business on track to record its first annual profit growth in three years.

Some analysts say Samsung shipped around 16 million Galaxy S7s in April-June, with a higher-priced curved-screen version outselling its flat-screen counterpart and boosting margins.
Lackluster sales of offerings from rivals such as Apple and LG Electronics also helped reduce marketing expenses, they said.

"While operating profit margins for the mobile phone business will decline in the third and fourth quarters as the Galaxy S7 effect fades, operating profit will continue to grow on an annual basis," Korea Investment & Securities said in a report.

As its smartphones thrive, Samsung's chip business, last year's key profit driver, probably saw quarterly profit sink to its lowest in nearly two years due to weak demand from makers of other smartphones and personal computers.

But signs of some price recovery for DRAM chips starting last month and Samsung's dominance in the premium solid-state disc drive market with its 3D NAND chip production technology suggest a pickup in coming months, analysts said.

(Reporting by Se Young Lee; Editing by Tony Munroe and Kenneth Maxwell)

Solar plane lands in Spain after three-day Atlantic crossing

SEVILLE, Spain An airplane powered solely by the sun landed safely in Seville in Spain early on Thursday after an almost three-day flight across the Atlantic from New York, in one of the longest legs of the first ever fuel-less flight around the world.

The single-seat Solar Impulse 2 touched down shortly after 7.30 a.m. local time in Seville after leaving John F. Kennedy International Airport at about 2.30 a.m. EDT on June 20.

The flight of just over 71 hours was the 15th leg of the round-the-world journey by the plane, piloted in turns by Swiss aviators Bertrand Piccard and Andre Borschberg.

"Oh-la-la, absolutely perfect," Piccard said after landing, thanking his engineering crew for their efforts.

With a cruising speed of around 70 kilometers an hour (43 miles per hour), similar to an average car, the plane has more than 17,000 solar cells built into wings with a span bigger than that of a Boeing 747.

(Reporting by Marcelo Pozo; Writing by Paul Day; Editing by Gopakumar Warrier)
