Showing posts with label Architecture. Show all posts
Showing posts with label Architecture. Show all posts

Tuesday, December 1, 2015

Detailed Exception class



   The C# StackTrace class can be useful for logging the source of errors, but when your assembly is built in Release mode, you lose valuable information in the StackFrame, like the line number, the column number or the file name.

   Part of my error handling strategy involved setting an error string, and using StackTrace to log the function calling the setter and the location in the code the error occurred. Unfortionatly, as mentioned above, I was losing error information like line number, and that kind of information sure is nice to have. Thats why I invented the DetailedException class.

   In .NET 4.5, one can get caller information by the use of default value parameters tagged with an special attribute, namely CallerFilePathAttribute, CallerMemberNameAttribute, CallerLineNumberAttribute.


How about a code example:

 [Serializable]
 public class DetailedException : Exception
 {
  public int SourceLineNumber { get; private set; }
  public string SourceFilePath { get; private set; }
  public string SourceMemberName { get; private set; }
   
  public DetailedException(string message,
     [CallerMemberName] string sourceMemberName = "",
     [CallerFilePath] string sourceFilePath = "",
     [CallerLineNumber] int sourceLineNumber = 0)
   : base(message)
  {
   this.SourceMemberName = sourceMemberName;
   this.SourceFilePath = sourceFilePath;
   this.SourceLineNumber = sourceLineNumber;
  }



   Now if you have to throw an exception, throw new DetailedException("Testing DetailedException. WOW. SUCH DETAILS."); and you will gain information like SourceLineNumber!

   If you decide to overload the constructor, be warned: You will be required to use named parameters when calling the DetailedException constructor

A Simple Word Prediction Library




   The word prediction feature on our phones are pretty handy and I've always and thought it would be fun to write one, and last night I decided to check that off my list. As usual, the whole project and all of its code is available to browse on GitHub. I talk more about the library and the design choices I made below the obnoxiously long image:


[Image of Windows Phone's Word Prediction feature]

   Visit the project and view the code on my GitHub, right here.
      (Project released under Creative Commons)

Overview:

   One thing you might notice, if for no other reason than I bring it up, is that I favor composition over inheritance. That is, my classes use a Dictionary internally, but they do not inherit from Dictionary. My word prediction library is not a minor variation or different flavor of the Dictionary class, and while it might be cool to access the word predictions for a word via an indexer, my word prediction library should not be treated as a dictionary.

Under the hood:

   There is a dictionary (a list of key/value pairs) of 'Word' objects. Each Word class has a value (the word), and its own dictionary of Word objects implemented as its own separate class (that does not inherit from Dictionary). This hidden dictionary inside each Word class keeps track of the probabilities of the the next word, for that given word. It does so by storing a Word as the key, and an integer counter value that gets incremented every time it tries to add a word to the dictionary that already exists (similar to my frequency dictionary, covered here).
The WordPredictionDictionary class doesn't grow exponentially, because each word is only represented once, by one Word class. The dictionaries inside the Word class only stores the references to the Word objects, not a copy of their values.
In order to begin using the WordPredictionDictionary to suggest words, one must train the WordPredictionDictionary on a representative body of text.

TODO:

  • Write methods to serialize the trained data sets so they can be saved and reloaded. This has been implemented.
  • Write an intelli-sense-like word suggestion program that implements the WordPredictionDictionary in an end-user application.

Tuesday, September 22, 2015

Threaded Equation Finder



Threaded Equation Finder

Find arithmetic equations that equates to a given 'target' value, number of terms, and operators.

Introduction

   You should all be familiar with how a typical computer works; you give it some variables, describe some quantities of some resources you have, choose an algorithm, let it process, and it returns to you a result or outcome. Now imagine a computer if you could work with a computer that worked the other way around. I believe it was Douglas Adams that described the notion of an all-together different type of computer; That is, you tell the computer what you want the outcome to be, and it goes off figuring out how to get there and what you need to do it. Z3, the Theorem Prover, and the constraint satisfaction problem (CSP) solver (and probably others) in Microsoft's Solver Foundation do almost exactly that.
   There is also the idea of Backcasting, which is a similar, but different idea.

   My program isn't as fancy as all that, but it does find equations that equates to a given 'target' value, albeit at random. You define constraints other than just the target value, such as what operators are allowed in the equation, the quantity of terms there, and the range or set of allowed terms.
For example, how many different ways can 9 nines equal 27, using only addition, subtraction, and multiplication, if evaluated left-to-right (ignore the order of operations)? Turns out there are only 67 ways.

(above) Application Screen Shot

How it works

   The actual approach for finding equations that equate to an arbitrary target or 'goal' value is rather primitive. By way of Brute Force, several threads are launched asynchronously to generate thousands of random equations, evaluate them and keep the results that have not been found yet.

   This is something I wrote approx. 2 years ago. I dug it up and decided to publish it, because I thought it was interesting. As this was an 'experiment', I created different ways of storing and evaluating the expressions, and have made those different 'strategies' conform to a common interface so I could easily swap them out to compare the different strategies. I have refactored the code so that each class that implements IEquation is in its own project and creates its own assembly.

   There are two fully-working strategies for representing equations: one that represented the equation as a list of 2-tuples (Term,Operator), did not perform order of operations, and was rather obtuse. The other strategy was to store the equation as a string and evaluate it using MSScriptControl.ScriptControl to Eval the string as a line of VBScript. This was unsurprisingly slower but allowed for much more robust equation evaluation. Order of operations is respected with the ScriptControl strategy, and opens the way to using using parenthesis.

   The other idea for a strategy which I have not implemented but would like to, would be a left-recursive Linq.Expression builder. Also, maybe I could somehow use Microsoft Solver Foundation for a wiser equation generation strategy than at random.



Limitations

   Today, however, there are better architectures. A concurrent system like this would benefit greatly from the Actor model. If you wanted to write complex queries against the stream of equations being generated or selected or solved, maybe reactive extensions would be a slam dunk.

   Although this project certainly is no Z3, it does provide an example of an interface... perhaps even a good one.



Running on Raspberry Pi 2

   Microsoft's Managed Extensibility Framework (MEF) might be a good thing here, but I also wrote a console client that is designed to be ran with Mono on the Raspberry Pi 2. MEF is a proprietary Microsoft .NET technology that is not supported in Mono. The extra meta data in the assembly shouldn't be a problem, but having a dependency on the MEF assembly will be. Probing of the runtime environment and dynamically loading of assemblies is required here, which I have not had time to do, so at this time, there is no MEF.

   The reason the mono client is a console application is because mono and winforms on the Raspberry Pi 2 fails for some people. The problem has something to do with a hardware floating point vs a software float, and it happens to manifest itself when using a TextBox control. The only thing that I haven't tried, and that should presumably fix it, is to re-build mono from the latest source.



Saturday, June 20, 2015

Lazy IEnumerable file reader class: When to inherit from IDispose?




    Need to read just the few couple lines from a gigantic data file? Or maybe you need a forward-only, load-only-what-you-need file reader pattern? Here is a construct I have been toying with, its a class that treats a file stream like an IEnumerable.

    Note added 8/1/15: TODO: Add constructor overload that accepts the starting filepointer position, optional ending filepointer position.

    This has the benefit of not using any resources if you never use it, allows you to incrementally read gigantic files without loading it all into memory (you might need to call the StreamReader.DiscardBufferedData() method every once in a while), and because its IEnumerable, you can write queries against it that are lazy; they don't actually execute until the run-time actually NEEDS it, such as calling the IEnumerable.ToList() or 'Count()' extensions, for example. Be careful with ToList() if the file is a gigabyte or more, as calling ToList() will cause the whole thing to be read right then.

    If instead you just need to iterate through each line only until you find what you are looking for, or use Linq and a predicate to search for a particular line that satisfies a condition, then this pattern will save your application from having the load the whole thing in memory:

public class EnumerableFileReader
{
    public FileInfo File { get { return _file; } }
    public bool FileExists { get { return _file.Exists; } }

    private FileInfo _file;

    public EnumerableFileReader(string fileLocation)
        : this(new FileInfo(fileLocation))
    {
    }

    public EnumerableFileReader(FileInfo file)
    {
        if (!file.Exists)
        {
            throw new FileNotFoundException();
        }

        _file = file;
    }

    public IEnumerable FileLines()
    {
        if (!FileExists) yield break;

        string line;
        //long internalBufferSize = 0;

        using (StreamReader reader = _file.OpenText())
        {
            while ((line = reader.ReadLine()) != null)
            {
                //if (internalBufferSize++ > 90000) {   reader.DiscardBufferedData(); internalBufferSize = 0; }
                yield return line;
            }
        }

        yield break;
    }
}

    It struck me that it might be a good idea to make the class inherit from IDisposable, so the StreamReader doesn't get left around in memory, holding a file open. Indeed; all those yield keywords make it look like the stream object will just hang around there if FileLines is never called again to finish the enumeration. However, it turns out this is probably not necessary but the answer is, as you might expect: IT DEPENDS. It depends... on how you are going to use the class. Looking into the subject, I discovered that when you use the yield keyword, the compiler generates a nested class which implements the IEnumerable, IEnumerator and IDisposable interfaces and stores all context data for you under the hood. I'm not going to drop the IL (or CIL) here, but if you are curious, just open up your IEnumerable class in ILSpy. Just make sure you change the language in the drop-down box at the top from C# to IL, otherwise it will be hidden.

    So just when is our class disposed of? Well anytime you explicitly call Dispose on the Enumerator or the Stream, which one might expect. However, this will dispose of a lot more than just the stream or the enumerator alone, but all of the associated constructs that are generated around the enumerator/yield pattern. to be disposed of. Dispose will also be called at the end of enumeration. This includes when you run out of things to enumerate, any time you use the yield break or return keyword, or the program flow leaves the using statement surrounding the Stream. Here is something I didn't know: Dispose is also called when you call the IEnumerable.First() or FirstOrDefault() extension. Also, any time you use a foreach loop around the IEnumerator, the object will get disposed after you are done looping or after you break or leave the loop.

  
So, in short: As long as you're using LINQ extensions or foreach loops, the disposal is taken care of for you automatically. However, if you are manually calling the Enumerator().MoveNext() on the method, then you need to call dispose yourself, or rather, implement the IDisposable pattern.

    Being able to use EnumerableFileReader in a using statement/disposable pattern would likely be expected of a file reader class. You could have your dispose method set a boolean flag and then call FileLines(), and add an if statement in the while look of your FileLines() method that will yield break when the dispose flag is set to true, but cleaning up properly can be tricky if your IEnumerator has more than one or two return yield statements. I would instead suggest that that we use one of the tricks we just leaned above and just have our Dispose() function call .FirstOrDefault() on the FileLines() method:


public class EnumerableFileReader : IDisposable
{
[...]

    public void Dispose()
    {
        FileLines().FirstOrDefault();
    }

[...]
}


Saturday, April 18, 2015

Hindenbugs, Heisenbugs and other types of software bugs




Different Types of Software Bugs


    Sorry, I haven't gotten around to creating a new post in close to a month (yikes!). I have just been super busy at work and have been trying to complete about 3 out of several not-quite-finished personal projects that I will be releasing on GitHub and blogging about

Heisenbug--
    1) A software bug that disappears or alters its behavior by the action of attempting to debug it.
    Possible sources: The effect of debuggers on the run-time code or JIT can be one source. Another might arise because of process scheduling or threading and race conditions. Or both, as stepping through the code with a debugger will affect the timing of threads.
    Derivation: Werner Heisenberg and his Uncertainty Principle.


Mandelbug--
    1) A bug whose causes are so complex it defies repair, or makes its behavior appear chaotic or even non-deterministic. Other symptoms include core dumps that, when viewed in an editor, form complex but repeating patterns or designs of characters and symbols.
    Possible sources: Operating system environment or configuration, the presence or absence of file system objects, time or scheduling-dependent code. Also, a very complicated state machine esp. one with more than one variable that hold state information and where transitions may not exists for every possible permutation of one state to another.
    Derivation: Benoît Mandelbrot and his research in fractal geometry.


Schrödinbug--
    1) A bug that manifests itself in running software after a programmer notices that the code should never have worked in the first place.
    Possible sources: A combination of naughty coding practices such as coding by coincidence and something else that allows for unexpected behavior such as a the effect of custom configurations or compiler optimizations, code obfuscation, or the JIT or on code. Also, a dependency on some arcane piece of technology that no one truly understands or employment of a black-box system or library, or some ancient, dinosaur system with a highly specific configuration that is so fragile that no-one dares update it because said updates could likely or is known to break functionality and where even a reboot is considered hairy.
    Derivation: Named for Erwin Schrödinger and his thought experiment.


Bohrbug--
    1) In contrast to the Heisenbug, the Bohrbug, is a "good, solid bug", easy to hunt down, or easily predicted from the description, esp. when a bug is the result of a classical programming mistake, or 'a rookie move'.
    Possible sources: Failing to initialize a variable or class to anything other than null, failure to RTFM, misuse of pointers or general abuse of memory, not coding defensively, coding by coincidence or not truly understanding how (or why) your code works.
    Derivation: Named after the physicist Niels Bohr and his rather simple atomic model.


Hindenbug--
    1) A bug with catastrophic consequences, esp. one that actually causes the server to burst into bright, hot flames.
    Possible sources: Code that dynamically builds and executes SQL scripts, esp. scripts that make use of the DELETE FROM command, code that is self modifying or replicates, or code that controls large machines or a physical system such as pumps, belts, cooling systems, aggregate pulverizer, fans, or any sort of thing with large, rotating blades. Also, any software that controls, monitors, tests, or is in any way whatsoever, wired up to a tactical nuclear defense system. In fact, lets just include anything with 'nuclear' or 'pulverize' in the name.
    Derivation: Refers to the Hindenburg disaster. The Hindenburg Blimp was, in turn, named after Paul von Hindenburg, the then-president of Germany from 1925->1934.



    Kishor S. Trivedi has a great talk/slideshow about Software Faults, Failures and Their Mitigations. He posits that it is not realistic to write software that is 100% bug free, or have 100% up-time. He displays the downtime in terms of minutes per year (MPY?) of several reliable corporations to support his claims, and it is true that as far as I am aware,there have not been any software ever written that did not have some downtime, even NASA.
    However, Trivedi shares a quote by one E. W. Dijkstra: "Testing shows the presence, not the absence, of bugs". Aww, logic. Now that's something I can get behind. Indeed; It is impossible to provably show that any software (of sufficient complexity) is absolutely error-free.
    Following this, he suggests we should not strive for the virtually impossible task building fault-free software, but rather aim to build instead fault-tolerant software.



Software Faults -> Software Errors -> Software Failures

Software Faults lead to Software Errors that lead to Software Failures.

Trivedi defines faults, errors and failures:
      -Software Failure occurs when the delivered service no longer complies with the desired output.
      -Software Error is that part of the system state which is liable to lead to subsequent failure.
      -Software Fault is adjudged or hypothesized cause of an error.Faults are the cause of errors that may lead to failures



    So, how does one test their tactical nuclear defense system code? VERY carefully, and it probably wouldn't hurt to comment out the Launch() method as well.

Wednesday, February 18, 2015

The Feature Flags Pattern





I was listening to Episode 1101 of the podcast Dot Net Rocks. Jez Humble was talking about the concept of Feature Toggles or Feature Flags. Feature Switches? While the term has some opinionated definitions, the concept that I found most interesting was the idea of deploying software with the new features initially disabled, or switched off by some mechanism. After you think the feature is ready for production, switch on the feature. If there is an issue, you don't have to roll back the version or deploy another release, just switch the feature back off again and replace its .dll. Didn't get enough debug information? Switch the feature back on and let a few crash reports trickle in and then switch it back off. If you feature is particularly processor or network intensive, you can perform load testing by slowly releasing the features to select clients or only part of the population/userbase.

While I personally choose an SQL table for my approach to storing the toggle switches (internal business app), one could use the application's .config file. An application could pull the settings from a .config file on a networked drive as a way of controlling multiple application instances by making one changed to a centralized location. Below I show a minimalist implementation by creating a Dictionary from the <appSettings> in a App.config file.

Behold:


public static Dictionary<string, bool> GetFeatureFlags()
{
  return ConfigurationManager.AppSettings.AllKeys.ToDictionary(s => s, IsFlagSet);
}

public static bool IsFlagSet(string settingName)
{
  bool result = false;
  bool.TryParse(ConfigurationManager.AppSettings[settingName], out result);
  return result;
}

Of course with a dictionary you have to be careful that you don't try an access a column that does not exist with the indexer, so you might be better off using IsFlagSet(string), which will never throw. Although this is of limited use (AppSettings is already a NameValueCollecion), perhaps you can make use of this generic function I wrote that uses generics to convert AppSettings into a dictionary with the type of the value being the generic:


public static Dictionary<string, T> GetDictionary<T>()
{
 return ConfigurationManager.AppSettings.AllKeys.ToDictionary<string, string, T>(s => s, GetSetting<T>);
}

public static T GetSetting<T>(string settingName)
{
  try
  {
    T result = (T) Convert.ChangeType(
      ConfigurationManager.AppSettings[settingName],
      typeof (T)
    );
    if (result != null)
    {
      return result;
    }
  }
  catch { }

  return default(T);
}

Please note that swallowing an exception ("catch { }") is typically considered poor form. In this particular scenario, I am aware of the possible exceptions that can be thrown by this code and I want the code to return the default(T) in this scenario and never throw.



Sunday, June 23, 2013

Gracefull error handling with a global exception handler




Every published C# application should have graceful error handling. Here I show you the implementation of a global exception handler using ThreadExceptionEventHandler.

First, you have to add System.Threading to both your Program.cs and Mainform.cs:
// Program.cs and Mainform.cs
using System.Threading;

Then add an event handler to Application.ThreadException:
// Program.cs
// static class Program {
//  private static void Main(string[] args) {
Application.ThreadException += new ThreadExceptionEventHandler(MainForm.MyExceptionHandler);
// Application.Run(new MainForm());

Or, if you are writing a console app, add an event handler to AppDomain.UnhandledException:
AppDomain.CurrentDomain.UnhandledException += new UnhandledExceptionEventHandler(MyExceptionHandler);

Then add the exception handler body:
// Mainform.cs
public static void MyExceptionHandler(object sender, ThreadExceptionEventArgs e)
{
   MessageBox.Show(e.Exception.Message,"Error",MessageBoxButtons.OK,MessageBoxIcon.Error);
}

The example here simply shows a message box, but an even more graceful approach would be to log all the unhanded exceptions to a log file. That way, the errors are transparent to the user, but the developer still has access to detailed debug information.

Here is the full code:
Program.cs
MainForm.cs