C# Programming Tips and Examples: .net

Showing posts with label .net. Show all posts

Thursday, December 24, 2015

Infix Notation Parser via Shunting-Yard Algorithm

Infix notation is the typical notation for writing equations in algebra.
An example would be: 7 - (2 * 5)

Parsing such an equation is not a trivial task, but I wanted one for my EquationFinder project, as I wanted to respect order of operations.

Strategies include substitution/replacement algorithms, recursion to parse into a tree and then tree traversal, or converting the infix notation to reverse polish notation (RPN), also known as post-fix notation, then using a stack based postfix notation evaluator. I choose the latter, as such algorithms are well defined in many places on the web.

My code consists of 3 classes, all static:
(Links go to the .cs file on GitHub)

InfixNotation - this simply holds a few public variables and calls the public methods on the below two classes.
ShuntingYardAlgorithm - this converts an equation in infix notation into postfix notation (aka RPN).
PostfixNotation - this evaluates the equation in postfix notation and returns a numerical result value.

In order to implement the shunting-yard algorithm and the postfix evaluator, I simply wrote the steps to the algorithms as written on Wikipedia:
(Links go to the Wikipedia article)
Link to the Shunting-Yard Algorithm to convert Infix notation to Postfix notation.
Link to the Postfix Notation Evaluation Algorithm.

The code for this is pretty extensive, but I will prettify it and present it below. Alternatively, you can view and download the code from the MathNotationConverter project on my GitHub.

InfixNotationParser:


public static class InfixNotation
{
   public static string Numbers = "0123456789";
   public static string Operators = "+-*/^";

   public static bool IsNumeric(string text)
   {
      return string.IsNullOrWhiteSpace(text) ? false : text.All(c => Numbers.Contains(c));
   }

  public static int Evaluate(string infixNotationString)
  {
    string postFixNotationString = ShuntingYardConverter.Convert(infixNotationString);
    int result = PostfixNotation.Evaluate(postFixNotationString);
    return result;
  }
}

ShuntingYardConverter

(converts an equation from infix notation into postfix notation):


public static class ShuntingYardAlgorithm
{
   private static string AllowedCharacters = InfixNotation.Numbers + InfixNotation.Operators + "()";

   private enum Associativity
   {
      Left, Right
   }
   private static Dictionary<char, int> PrecedenceDictionary = new Dictionary<char, int>()
   {
      {'(', 0}, {')', 0},
      {'+', 1}, {'-', 1},
      {'*', 2}, {'/', 2},
      {'^', 3}
   };
   private static Dictionary<char, Associativity> AssociativityDictionary = new Dictionary<char, Associativity>()
   {
      {'+', Associativity.Left},
      {'-', Associativity.Left},
      {'*', Associativity.Left},
      {'/', Associativity.Left},
      {'^', Associativity.Right}
   };

   private static void AddToOutput(List<char> output, params char[] chars)
   {
      if (chars != null && chars.Length > 0)
      {
         foreach (char c in chars)
         {
            output.Add(c);
         }
         output.Add(' ');
      }
   }
   
   public static string Convert(string infixNotationString)
   {
      if (string.IsNullOrWhiteSpace(infixNotationString))
      {
         throw new ArgumentException("Argument infixNotationString must not be null, empty or whitespace.", "infixNotationString");
      }

      List<char> output = new List<char>();
      Stack<char> operatorStack = new Stack<char>();
      string sanitizedString = new string(infixNotationString.Where(c => AllowedCharacters.Contains(c)).ToArray());

      string number = string.Empty;
      List<string> enumerableInfixTokens = new List<string>();
      foreach (char c in sanitizedString)
      {
         if (InfixNotation.Operators.Contains(c) || "()".Contains(c))
         {
            if (number.Length > 0)
            {
               enumerableInfixTokens.Add(number);
               number = string.Empty;
            }
            enumerableInfixTokens.Add(c.ToString());
         }
         else if (InfixNotation.Numbers.Contains(c))
         {
            number += c.ToString();
         }
         else
         {
            throw new Exception(string.Format("Unexpected character '{0}'.", c));
         }
      }

      if (number.Length > 0)
      {
         enumerableInfixTokens.Add(number);
         number = string.Empty;
      }

      foreach (string token in enumerableInfixTokens)
      {
         if (InfixNotation.IsNumeric(token))
         {
            AddToOutput(output, token.ToArray());
         }
         else if (token.Length == 1)
         {
            char c = token[0];

            if (InfixNotation.Numbers.Contains(c)) // Numbers (operands)
            {
               AddToOutput(output, c);
            }
            else if (InfixNotation.Operators.Contains(c)) // Operators
               if (operatorStack.Count > 0)
               {
                  char o = operatorStack.Peek();
                  if ((AssociativityDictionary[c] == Associativity.Left &&
                     PrecedenceDictionary[c] <= PrecedenceDictionary[o])
                        ||
                     (AssociativityDictionary[c] == Associativity.Right &&
                     PrecedenceDictionary[c] < PrecedenceDictionary[o]))
                  {
                     AddToOutput(output, operatorStack.Pop());
                  }
               }
               operatorStack.Push(c);
            }
            else if (c == '(') // open brace
            {
               operatorStack.Push(c);
            }
            else if (c == ')') // close brace
            {
               bool leftParenthesisFound = false;
               while (operatorStack.Count > 0 )
               {
                  char o = operatorStack.Peek();
                  if (o != '(')
                  {
                     AddToOutput(output, operatorStack.Pop());
                  }
                  else
                  {
                     operatorStack.Pop();
                     leftParenthesisFound = true;
                     break;
                  }
               }

               if (!leftParenthesisFound)
               {
                  throw new FormatException("The algebraic string contains mismatched parentheses (missing a left parenthesis).");
               }
            }
            else // wtf?
            {
               throw new Exception(string.Format("Unrecognized character '{0}'.", c));
            }
         }
         else
         {
            throw new Exception(string.Format("String '{0}' is not numeric and has a length greater than 1.", token));
         }
      } // end foreach

      while (operatorStack.Count > 0)
      {
         char o = operatorStack.Pop();
         if (o == '(')
         {
            throw new FormatException("The algebraic string contains mismatched parentheses (extra left parenthesis).");
         }
         else if (o == ')')
         {
            throw new FormatException("The algebraic string contains mismatched parentheses (extra right parenthesis).");
         }
         else
         {
            AddToOutput(output, o);
         }
      }

      return new string(output.ToArray());
   }
}

PostfixNotation

(evaluates the postfix notation and returns a numerical result):


public static class PostfixNotation
{
   private static string AllowedCharacters = InfixNotation.Numbers + InfixNotation.Operators + " ";

   public static int Evaluate(string postfixNotationString)
   {
      if (string.IsNullOrWhiteSpace(postfixNotationString))
      {
         throw new ArgumentException("Argument postfixNotationString must not be null, empty or whitespace.", "postfixNotationString");
      }

      Stack<string> stack = new Stack<string>();
      string sanitizedString = new string(postfixNotationString.Where(c => AllowedCharacters.Contains(c)).ToArray());
      List<string> enumerablePostfixTokens = sanitizedString.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();

      foreach (string token in enumerablePostfixTokens)
      {
         if (token.Length > 0)
         {
            if (token.Length > 1)
            {
               if (InfixNotation.IsNumeric(token))
               {
                  stack.Push(token);
               }
               else
               {
                  throw new Exception("Operators and operands must be separated by a space.");
               }
            }
            else
            {
               char tokenChar = token[0];

               if (InfixNotation.Numbers.Contains(tokenChar))
               {
                  stack.Push(tokenChar.ToString());
               }
               else if (InfixNotation.Operators.Contains(tokenChar))
               {
                  if (stack.Count < 2)
                  {
                     throw new FormatException("The algebraic string has not sufficient values in the expression for the number of operators.");
                  }

                  string r = stack.Pop();
                  string l = stack.Pop();

                  int rhs = int.MinValue;
                  int lhs = int.MinValue;

                  bool parseSuccess = int.TryParse(r, out rhs);
                  parseSuccess &= int.TryParse(l, out lhs);
                  parseSuccess &= (rhs != int.MinValue && lhs != int.MinValue);

                  if (!parseSuccess)
                  {
                     throw new Exception("Unable to parse valueStack characters to Int32.");
                  }

                  int value = int.MinValue;
                  if (tokenChar == '+')
                  {
                     value = lhs + rhs;
                  }
                  else if (tokenChar == '-')
                  {
                     value = lhs - rhs;
                  }
                  else if (tokenChar == '*')
                  {
                     value = lhs * rhs;
                  }
                  else if (tokenChar == '/')
                  {
                     value = lhs / rhs;
                  }
                  else if (tokenChar == '^')
                  {
                     value = (int)Math.Pow(lhs, rhs);
                  }

                  if (value != int.MinValue)
                  {
                     stack.Push(value.ToString());
                  }
                  else
                  {
                     throw new Exception("Value never got set.");
                  }
               }
               else
               {
                  throw new Exception(string.Format("Unrecognized character '{0}'.", tokenChar));
               }
            }
         }
         else
         {
            throw new Exception("Token length is less than one.");
         }
      }

      if (stack.Count == 1)
      {
         int result = 0;
         if (!int.TryParse(stack.Pop(), out result))
         {
            throw new Exception("Last value on stack could not be parsed into an integer.");
         }
         else
         {
            return result;
         }
      }
      else
      {
         throw new Exception("The input has too many values for the number of operators.");
      }

   } // method
} // class

Another alternative technique is to using the Shunting-Yard Algorithm to turn infix notation into an abstract syntax tree (Linq.Expressions anyone?). I will likely post this technique later.

Other blog posts by me that are related to this article are the Threaded Equation Finder, a Mixed Radix System Calulator and Drawing Text Along a Bezier Spline.

Tuesday, December 1, 2015

Detailed Exception class

The C# StackTrace class can be useful for logging the source of errors, but when your assembly is built in Release mode, you lose valuable information in the StackFrame, like the line number, the column number or the file name.

Part of my error handling strategy involved setting an error string, and using StackTrace to log the function calling the setter and the location in the code the error occurred. Unfortionatly, as mentioned above, I was losing error information like line number, and that kind of information sure is nice to have. Thats why I invented the DetailedException class.

In .NET 4.5, one can get caller information by the use of default value parameters tagged with an special attribute, namely CallerFilePathAttribute, CallerMemberNameAttribute, CallerLineNumberAttribute.

How about a code example:

 [Serializable]
 public class DetailedException : Exception
 {
  public int SourceLineNumber { get; private set; }
  public string SourceFilePath { get; private set; }
  public string SourceMemberName { get; private set; }
   
  public DetailedException(string message,
     [CallerMemberName] string sourceMemberName = "",
     [CallerFilePath] string sourceFilePath = "",
     [CallerLineNumber] int sourceLineNumber = 0)
   : base(message)
  {
   this.SourceMemberName = sourceMemberName;
   this.SourceFilePath = sourceFilePath;
   this.SourceLineNumber = sourceLineNumber;
  }

Now if you have to throw an exception, throw new DetailedException("Testing DetailedException. WOW. SUCH DETAILS."); and you will gain information like SourceLineNumber!

If you decide to overload the constructor, be warned: You will be required to use named parameters when calling the DetailedException constructor

A Simple Word Prediction Library

The word prediction feature on our phones are pretty handy and I've always and thought it would be fun to write one, and last night I decided to check that off my list. As usual, the whole project and all of its code is available to browse on GitHub. I talk more about the library and the design choices I made below the obnoxiously long image:

[Image of Windows Phone's Word Prediction feature]

Visit the project and view the code on my GitHub, right here.
(Project released under Creative Commons)

Overview:

One thing you might notice, if for no other reason than I bring it up, is that I favor composition over inheritance. That is, my classes use a Dictionary internally, but they do not inherit from Dictionary. My word prediction library is not a minor variation or different flavor of the Dictionary class, and while it might be cool to access the word predictions for a word via an indexer, my word prediction library should not be treated as a dictionary.

Under the hood:

There is a dictionary (a list of key/value pairs) of 'Word' objects. Each Word class has a value (the word), and its own dictionary of Word objects implemented as its own separate class (that does not inherit from Dictionary). This hidden dictionary inside each Word class keeps track of the probabilities of the the next word, for that given word. It does so by storing a Word as the key, and an integer counter value that gets incremented every time it tries to add a word to the dictionary that already exists (similar to my frequency dictionary, covered here).
The WordPredictionDictionary class doesn't grow exponentially, because each word is only represented once, by one Word class. The dictionaries inside the Word class only stores the references to the Word objects, not a copy of their values.
In order to begin using the WordPredictionDictionary to suggest words, one must train the WordPredictionDictionary on a representative body of text.

TODO:

~~Write methods to serialize the trained data sets so they can be saved and reloaded.~~ This has been implemented.
Write an intelli-sense-like word suggestion program that implements the WordPredictionDictionary in an end-user application.

Thursday, October 29, 2015

Thinq - A Linq Experiment

   View/Download the source code from the project's GitHub

   So I wrote a program, just an experiment, where I was making a range class using IEnumerables (C#), and each element doesn't have to increment by one, but any amount. so I was creating ranges like 7 to 10 million, increment by 7, so upon enumeration it would yield multiples of 7. This is also called arithmetic progression.

   Then I started combining different multiples with query operators like Where operator or Intersect like IEnumerable result = multiples7.Intersect(multiples13.MoveNext()), essentially creating a function that keeps only those numbers that are multiples of both 7 and 13, starting with the least common multiple.

   So I began testing. After some playing, I decided to take the first 7 primes, and find any common multiples to them between 1 and 10 million. Much to my surprise, it found all the common multiples of the first 7 prime numbers under 10 million (there are only two of them, 4849845 & 9699690), and it did it in 500 milliseconds on some very modest hardware (1 core, 2.16GHz, 4GB ram).

   I bumped up the ceiling to 50 million and I got an OutOfMemoryException because the IEnumerable holds on to every value it gets from the function MoveNext(). I threw in some metrics and discovered that it took about 3 seconds and some 32-million, 64-bit integers for my computer to declare 'out of memory'.

   Well, at least it was fast, even if it did eat up all my ram in 3 seconds, it was still promising.

   The solution was to create an IEnumerator that was aware of the arithmetic sequences that constrained the results set. When MoveNext() is called repeatedly during enumeration, I avoid the infinite memory requirement by restricting the result set returned from MoveNext(); it returns the next whole number that is divisible by every arithmetic sequence's 'common difference', or increment value. In this way, you have created a enumerable sequence that is the _intersection_ of all of the sequences.

   The enumerator is prevented from running to infinity by obeying two limits: A maximum numeric value (cardinal) that GetNext() will return to ("results less than 50 million") and a maximum quantity of results (ordinal) that GetNext() yields ("the one millionth result").    If either of these limits are exceeded, the while loop will fail to evaluate to true. It is very common for my processor-intensive, long running or 'mathy' applications to employ a temporal limit (maximum time-to-live) or support cancellation, but this little experiment has been so performant that I have been able to get by without one.

   So what kind of improvement did we get out of our custom enumerable? I can now find all the common factors for the first 8 prime numbers up to 1 billion in 25 seconds! I was impressed; the application used to max out around 50 million and run out of memory, and now it can investigate to one billion in a reasonable amount of time and the memory it uses is not much more than the 8 or so integers in the result set. 1 billion, however seems to be the sweet spot for my single 2.13 GHz laptop. I ran the same 8 primes to 2 billion and it took 1 minute, 12 seconds:


TIME ELAPSED: 01:12.38
LCM[3,5,7,11,13,17,19,23] (max 2,000,000,000)

17 FACTORS:

111546435 
223092870 
334639305 
446185740 
557732175 
669278610 
780825045 
892371480 
1003917915 
1115464350 
1227010785 
1338557220 
1450103655 
1561650090 
1673196525 
1784742960 
1896289395

Tuesday, October 6, 2015

Mixed Radix Numeral System class and Counter

Mixed Radix Calculator

My 'Mixed Radix Calculator' creates a counting system of radices (plural of radix), such as base 12 or mixed radices such as Minutes/Hours/Days/Years: 365:24:60:60. I choose the left side to be the most significant side. This is merely a personal preference, and my MixedRadixSystem class supports displaying both alignments.

Of course you dont have to choose a mixed radix numeral system, you can count in an N-base numeral system, such as base 7 or a more familiar base 16. Another feature lies in my RadixNumeral class. Each numeral, or place value, supports having its own dictionary of symbols.

- View the code at its GitHub - https://github.com/AdamWhiteHat/MixedRadixCalculator

(Project released under Creative Commons)

-  52:7:24:60:60:1000  -

  A numeral system (or system of numeration) is a writing system for expressing numbers.

  The most familiar one is of course the decimal numeral system. This is a 10-base numbering system. Computers use a binary numeral system. The base is sometimes called the radix or scale.

  Not all numbering systems have just one base. Take for example, how we currently divide time: There are 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, and 365 days in a year. This is called a mixed radix numeral system, and one might express the above mixed radix system like: 365:24:60:60.

  https://en.wikipedia.org/wiki/Mixed_radix
  http://mathworld.wolfram.com/Base.html

Uses:
  I haven't found a lot of use cases for it yet, but it is interesting. I originally built this because I wanted to experiment with numeral systems that uses increasing consecutive prime numbers for each radix, as well as experiment with some off-bases, such as base 3 or base 7.

  In a single base, say base 7, then 'round numbers' with only one place value having a 1 and the rest having zeros, such as 1:0:0:0:0 (in base 7), such numbers are powers of 7, and ever other number except for the 1's place value is a multiple of 7.

  A mixed radix numeral system can represent a polynomial, and possibly provide for a simpler way to visualize and reason about them.

  Yet another possible use is to make a numeral system with a base that is larger than and co-prime to some other target number (say 256) to make a bijective map from every value in a byte to some other value exactly once by repeatedly adding the value of the co-prime, modulus 256. This can appear rather random (or sometimes not at all) but the mapping is easily determined given the co-prime. I have talked about this notion before on my blog
  https://csharpcodewhisperer.blogspot.com/search/label/Coprime

  If you like this project you would probably like my project EquationFinder, it finds equations given constraints
  https://github.com/AdamWhiteHat/EquationFinder

Tuesday, September 22, 2015

Threaded Equation Finder

Find arithmetic equations that equates to a given 'target' value, number of terms, and operators.

View the GitHub - Download source .zip

Introduction

You should all be familiar with how a typical computer works; you give it some variables, describe some quantities of some resources you have, choose an algorithm, let it process, and it returns to you a result or outcome. Now imagine a computer if you could work with a computer that worked the other way around. I believe it was Douglas Adams that described the notion of an all-together different type of computer; That is, you tell the computer what you want the outcome to be, and it goes off figuring out how to get there and what you need to do it. Z3, the Theorem Prover, and the constraint satisfaction problem (CSP) solver (and probably others) in Microsoft's Solver Foundation do almost exactly that.
There is also the idea of Backcasting, which is a similar, but different idea.

My program isn't as fancy as all that, but it does find equations that equates to a given 'target' value, albeit at random. You define constraints other than just the target value, such as what operators are allowed in the equation, the quantity of terms there, and the range or set of allowed terms.
For example, how many different ways can 9 nines equal 27, using only addition, subtraction, and multiplication, if evaluated left-to-right (ignore the order of operations)? Turns out there are only 67 ways.

(above) Application Screen Shot

How it works

The actual approach for finding equations that equate to an arbitrary target or 'goal' value is rather primitive. By way of Brute Force, several threads are launched asynchronously to generate thousands of random equations, evaluate them and keep the results that have not been found yet.

This is something I wrote approx. 2 years ago. I dug it up and decided to publish it, because I thought it was interesting. As this was an 'experiment', I created different ways of storing and evaluating the expressions, and have made those different 'strategies' conform to a common interface so I could easily swap them out to compare the different strategies. I have refactored the code so that each class that implements IEquation is in its own project and creates its own assembly.

There are two fully-working strategies for representing equations: one that represented the equation as a list of 2-tuples (Term,Operator), did not perform order of operations, and was rather obtuse. The other strategy was to store the equation as a string and evaluate it using MSScriptControl.ScriptControl to Eval the string as a line of VBScript. This was unsurprisingly slower but allowed for much more robust equation evaluation. Order of operations is respected with the ScriptControl strategy, and opens the way to using using parenthesis.

The other idea for a strategy which I have not implemented but would like to, would be a left-recursive Linq.Expression builder. Also, maybe I could somehow use Microsoft Solver Foundation for a wiser equation generation strategy than at random.

Limitations

Today, however, there are better architectures. A concurrent system like this would benefit greatly from the Actor model. If you wanted to write complex queries against the stream of equations being generated or selected or solved, maybe reactive extensions would be a slam dunk.

Although this project certainly is no Z3, it does provide an example of an interface... perhaps even a good one.

Running on Raspberry Pi 2

Microsoft's Managed Extensibility Framework (MEF) might be a good thing here, but I also wrote a console client that is designed to be ran with Mono on the Raspberry Pi 2. MEF is a proprietary Microsoft .NET technology that is not supported in Mono. The extra meta data in the assembly shouldn't be a problem, but having a dependency on the MEF assembly will be. Probing of the runtime environment and dynamically loading of assemblies is required here, which I have not had time to do, so at this time, there is no MEF.

The reason the mono client is a console application is because mono and winforms on the Raspberry Pi 2 fails for some people. The problem has something to do with a hardware floating point vs a software float, and it happens to manifest itself when using a TextBox control. The only thing that I haven't tried, and that should presumably fix it, is to re-build mono from the latest source.

Sunday, August 2, 2015

Certificate Enumerator

Recently, my windows quit updating. Just prior to that, I had been messing around with my certificate store, so I suspected that to be the cause. Running Microsoft's troubleshooter reset the download, which I had a lot of hope of fixing the issue, but the download still continued to to fail. I decided to check the Windows Event Logs, and that's where I found an error message about a certificate in the chain failing. I knew it! However, I did not know whether a trusted certificate accidentally got put in the untrusted store, or whether an untrusted certificate was accidentally put in the trusted store. I needed a way to search all of my certificates' thumbprints or serial numbers against a know repository of trusted or untrusted certificates.
Microsoft's certificate snap-in for MMC does not allow you to view certificates in an efficient way. Opening them one at a time, manually, and then scrolling all the way down to where the thumbprint is displayed to compare it to a webpage is painful. Also, I was not satisfied with the way that Microsoft allows you to search the certificate store. The search is not very effective and you cant even search for thumbprints! Also I do not believe the search feature allows you anyway to copy any of that information to clipboard.
Most of what I needed to accomplish could simply be done if I could just export all of my computers certificates thumbprint or serial numbers to a text file, csv file, or other simple and searchable format. Then I thought to myself, I know how to do that! It was the great the lack of features of the MMC certificate snap-in, and the inability to search for certificate thumbprints that inspired me to write my own certificate utility, known simply as Certificate Enumerator.

CertificateEnumerator can list every certificate in your various certificate stores for your local machine and currently logged in user. It can then display that information to you either in a DataGridView or TextBox (as columnarized text), and provides the ability to persist that information to file as text, comma separated values (CSV), excel format or HTML table.

The Certificate Enumerator also has the ability to 'validate' each certificate against its CRL (certificate revocation list), if it supplied one.

The GUI could really use some love. In case you missed it, the project is on my GitHub, so feel free to download the source and play with it. If you come up with useful, submit a pull request.

Wednesday, July 29, 2015

Finding a date range in SQL

    At work, we use log4net, and we have the appender (or logging output location) set up to be AdoNetAppender, so thus it logs to a SQL database. In the Log table, there is a column called [Date], and it has the sql type of datetime.

    Often, when querying the Log table, you only want to view the most recent dates. lets say within the last week. You could always ORDER BY [Date] DESC of course, but suppose we wanted more control than that, such as only last week's dates.

    The SQL keywords (and functions) that are relevant here are BETWEEN, GETDATE and DATEADD.

    Here is the SQL code:


 SELECT
     [ID],[Date],[Thread],[Level],[Logger],[Message],[Exception]
 FROM
     [DatabaseName].[dbo].[Log]
 WHERE
     [Date] BETWEEN
      DATEADD(dd, -7, GETDATE())
      AND
      DATEADD(dd,  1, GETDATE())
 ORDER BY

     [Date] DESC

The BETWEEN keyword should be pretty self-explanatory, as should the GETDATE function. The secret sauce here lies within the DATEADD function.

The SQL function DATEADD has this signature: DATEADD (datepart, number, date)

The DATEADD function adds a number to a component of DATETIME, in this case, days. This number can be negative to subtract time from a DATETIME, as is the case with our example. The datepart parameter is what determines what component of the DATETIME we are adding to. You can add as much as a year, or as little as a nanosecond (what, no picoseconds? *laugh*). Microsoft's Transact-SQL MSDN page for DATEADD supplies the following table for datepart:

DATEPART	ABBREVIATIONS
year	yy, yyyy
quarter	qq, q
month	mm, m
dayofyear	dy, y
day	dd, d
week	wk, ww
weekday	dw, w
hour	hh
minute	mi, n
second	ss, s
millisecond	ms
microsecond	mcs
nanosecond	ns

In the example, I am subtracting 7 days from the current date. If you are making a stored procedure, this variable can be replaced with a parameter:


 CREATE PROCEDURE [dbo].[sp_GetLogEntriesRange]
     @RangeInDays int
 AS
 BEGIN
     DECLARE @DaysToAdd int
     SET @DaysToAdd = (0 - @RangeInDays)

     SELECT
      [ID],[Date],[Thread],[Level],[Logger],[Message],[Exception]
     FROM
      [DatabaseName].[dbo].[Log]
     WHERE
      [Date] BETWEEN
       DATEADD(dd, @DaysToAdd, GETDATE())
       AND
       DATEADD(dd,  1, GETDATE())
     ORDER BY
      [Date] DESC
 END

Enjoy, I hope this helps!

Saturday, June 20, 2015

Lazy IEnumerable file reader class: When to inherit from IDispose?

Need to read just the few couple lines from a gigantic data file? Or maybe you need a forward-only, load-only-what-you-need file reader pattern? Here is a construct I have been toying with, its a class that treats a file stream like an IEnumerable.

    Note added 8/1/15: TODO: Add constructor overload that accepts the starting filepointer position, optional ending filepointer position.

    This has the benefit of not using any resources if you never use it, allows you to incrementally read gigantic files without loading it all into memory (you might need to call the StreamReader.DiscardBufferedData() method every once in a while), and because its IEnumerable, you can write queries against it that are lazy; they don't actually execute until the run-time actually NEEDS it, such as calling the IEnumerable.ToList() or 'Count()' extensions, for example. Be careful with ToList() if the file is a gigabyte or more, as calling ToList() will cause the whole thing to be read right then.

    If instead you just need to iterate through each line only until you find what you are looking for, or use Linq and a predicate to search for a particular line that satisfies a condition, then this pattern will save your application from having the load the whole thing in memory:


public class EnumerableFileReader
{
    public FileInfo File { get { return _file; } }
    public bool FileExists { get { return _file.Exists; } }

    private FileInfo _file;

    public EnumerableFileReader(string fileLocation)
        : this(new FileInfo(fileLocation))
    {
    }

    public EnumerableFileReader(FileInfo file)
    {
        if (!file.Exists)
        {
            throw new FileNotFoundException();
        }

        _file = file;
    }

    public IEnumerable FileLines()
    {
        if (!FileExists) yield break;

        string line;
        //long internalBufferSize = 0;

        using (StreamReader reader = _file.OpenText())
        {
            while ((line = reader.ReadLine()) != null)
            {
                //if (internalBufferSize++ > 90000) {   reader.DiscardBufferedData(); internalBufferSize = 0; }
                yield return line;
            }
        }

        yield break;
    }
}

    It struck me that it might be a good idea to make the class inherit from IDisposable, so the StreamReader doesn't get left around in memory, holding a file open. Indeed; all those yield keywords make it look like the stream object will just hang around there if FileLines is never called again to finish the enumeration. However, it turns out this is probably not necessary but the answer is, as you might expect: IT DEPENDS. It depends... on how you are going to use the class. Looking into the subject, I discovered that when you use the yield keyword, the compiler generates a nested class which implements the IEnumerable, IEnumerator and IDisposable interfaces and stores all context data for you under the hood. I'm not going to drop the IL (or CIL) here, but if you are curious, just open up your IEnumerable class in ILSpy. Just make sure you change the language in the drop-down box at the top from C# to IL, otherwise it will be hidden.

    So just when is our class disposed of? Well anytime you explicitly call Dispose on the Enumerator or the Stream, which one might expect. However, this will dispose of a lot more than just the stream or the enumerator alone, but all of the associated constructs that are generated around the enumerator/yield pattern. to be disposed of. Dispose will also be called at the end of enumeration. This includes when you run out of things to enumerate, any time you use the yield break or return keyword, or the program flow leaves the using statement surrounding the Stream. Here is something I didn't know: Dispose is also called when you call the IEnumerable.First() or FirstOrDefault() extension. Also, any time you use a foreach loop around the IEnumerator, the object will get disposed after you are done looping or after you break or leave the loop.

So, in short: As long as you're using LINQ extensions or foreach loops, the disposal is taken care of for you automatically. However, if you are manually calling the Enumerator().MoveNext() on the method, then you need to call dispose yourself, or rather, implement the IDisposable pattern.

Being able to use EnumerableFileReader in a using statement/disposable pattern would likely be expected of a file reader class. You could have your dispose method set a boolean flag and then call FileLines(), and add an if statement in the while look of your FileLines() method that will yield break when the dispose flag is set to true, but cleaning up properly can be tricky if your IEnumerator has more than one or two return yield statements. I would instead suggest that that we use one of the tricks we just leaned above and just have our Dispose() function call .FirstOrDefault() on the FileLines() method:


public class EnumerableFileReader : IDisposable
{
[...]

    public void Dispose()
    {
        FileLines().FirstOrDefault();
    }

[...]
}

Saturday, June 6, 2015

Humorous Software Licence Agreement #2

Please note: I do not know what kind of legal protection against liability this licence will actually afford you. Please confer with an actual lawyer before attempting to use this in any legal capacity.



Software Usage License Agreement
Copyright (C) [copyright holders]
All rights reserved.

 * WITH NO GUARANTEES OF ANY KIND
THIS SOFTWARE IS PROVIDED:
 * AS-IS
 * NOT AS IT IS'NT
 * WITH NO WARRANTY OF ANY KIND
 * WITH NO PROMISES OF ANY KIND
 * WITH NO ASSURANCES OF ANY KIND
 * WITH A YAHOO TOOLBAR ATTACHED IF YOU DOWNLOADED IT FROM CNET

THIS INCLUDES, BUT IS NOT LIMITED TO:
 * WARRANTIES OF MERCHANTABILITY
 * GUARANTEE OF FITNESS FOR A PARTICULAR PURPOSE
 * ASSURANCES THAT IT WORKS, OR EVER WORKED, OR MIGHT WORK ANYTIME IN THE FUTURE
 * ASSURANCES THAT IT WILL NOT CAUSE YOU PSYCHOLOGICAL TRAUMA OR MENTAL HARM
 * PROMISES THAT IT WILL NOT CAUSE YOU TOTAL FINANCIAL RUIN
 * PROMISES THAT IT WILL NOT CAUSE YOU BODILY HARM OR PERSONAL INJURY
 * PROMISES THAT IT WILL NOT CAUSE YOUR SERVERS TO IGNITE IN FLAME
 * ASSURANCES THAT IT WILL NOT END THE HUMAN RACE
 * ASSURANCES THAT IT WILL NOT ANNIHILATE SPACE-TIME, AND THEREFORE EXISTENCE ITSELF
 * ASSURANCES THAT THESE WORDS ACTUALLY EXIST AND ARE NOT, IN FACT, AN ILLUSION
 
THEREFORE ANY PARTY THAT CHOOSES TO USE THIS SOFTWARE DOES SO AT THEIR OWN RISK AND
THEREFORE ASSUMES ANY RISK THAT IS DIRECTLY RELATED TO, CAUSED BY, OR ASSOCIATED WITH THE POSSESSION, READING, UNDERSTANDING, PRINTING, BURNING, USE OF, OR INTERACTION WITH THIS SOFTWARE
THEREFORE IT IS OF THE OPINION OF THE COPYRIGHT HOLDER/AUTHOR THAT THE SOFTWARE SHOULD NOT USED, BY ANYONE, FOR ANY PURPOSE, BUT DO NOT FORBID IT, AND FURTHERMORE, GRANT IT

It is hereby granted, to you, the rights to view it, use it, copy it, modify it, publish it, distribute it, sub-license it, or sell it for any purpose you see fit, subject to the following conditions:
 * The above copyright notice and this permission notice shall be included in all copies or substantial portions of the software.

UNDER NO CIRCUMSTANCES AND IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT, COURSE OF BUSINESS, LATE-NIGHT DRUNKEN PRANKS, OR ACT OF,
OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE

Want to read more? I wrote a Humorous Software Licence Agreement #1 post with a similar-themed software licence.

Friday, April 24, 2015

Install C# on Raspberry Pi 2 with Mono

Intro

We can run C# on the Raspberry Pi 2! This has made the raspberry pi very valuable in my eyes. You can use Visual Studio to compile a .exe and then RUN IT on The Pi using Mono.

What I liked about The Pi: Getting it working was dead simple. I just formatted an SD card in FAT 32, downloading and extracting a 500mb .zip file and copying the contents of the extracted folder onto the SD card. After you've accomplished that, go take a beer break, you deserved it *whew*.

I was able to just plug in the cables, put in the SD card and I running Linux on The Pi. I was using the HDMI and Ethernet ports, however. If you wish to use the tiny, headphone-jack-looking component video or a USB wireless adapter however, there will be additional steps.

Problems / Gotchas

Winforms may not play nice with Mono when on the Raspberry pi. Specifically, the problem manifests when you attempt to use a TextBox on your form, but who uses those? This bug is supposedly fixed in the latest releases, but I have yet to get it to work by just updating my mono. It is very likely that I have to REBUILD mono from the latest source on the pi, which can take several hours. I have not tested this. The bug tracker for this says it has to do with with the pi's 'hard floats', which is referring to hardware floating point calculations.

The hardware is not exactly what I would call stable. Sometimes it does not POST (the BIOS is a binary blob, so does a pi really have a POST?). Anyways, ensure your USB charger that plugs into the wall that can supply plenty of power. The websites say 700ma at a minimum. I find using a 700ma charger is not sufficient. I use a charger that can push 2A, and I dont see many problems.

How does this work?

A project called Mono. The Mono team has implemented a Common Language Runtime (CLR) that runs in the Linux and Mac environments and is coded to the ECMA-335 Common Language Infrastructure (CLI) AND covers functions and classes in the .NET framework, going all the way to 4.0 with some 4.5! Its open source, of course, and also has a compiler that follows the ECMA-334 C# Language Specification to turn C# code into CLI intermediate code that can be ran in windows as well. Remember, the CLI is just a specification that any language could (theoretically) be compiled to CLI.

Installing Mono C# on Raspberry Pi

This step was pretty straight forward as well. The instructions below are if you installed the NOOBS package which uses Raspian. If you have a different flavor of Linux, the commands below might be different.

You should already have sudo installed on your raspb pi. Prefixing a command with sudo allows you to run processes under a higher privilege without having to log into root. Also if you didnt know, the default username/password to log in the first time is pi/raspberry. If you prefer, you can type 'startx' to bring up the xwindows GUI. The commands below need to be entered into a terminal, which can be brought up within the GUI.

Fist, you want to update your Raspian to the latest version. To do that, you have to have the raspberry pi hooked up the the internet. If you have an ethernet cable and the router uses DHCP, plugging in the ethernet will be sufficient. Then, at the console simply enter the below line into the console:

sudo apt-get update

That may take a few minutes to complete and might prompt you to hit Y/N regarding the size of the package.

sudo apt-get install mono-runtime

This will also will take a few minutes to complete and might prompt you to hit Y/N regarding the size of the package.

After you've accomplished that, go have another beer because that was hard work!

Mono can now be used to run .NET executables:

mono Test.exe

Or, if your program requires elevated rights:

sudo mono Test.exe

Mono also features a REPL (read eval print loop). To access it, type 'csharp'. You can use your favorite text editor to make .cs files and compile them. My recommendation is, of course, to use Microsoft's Visual Studio. You can build and compile an application with MSVS in windows, and just transfer the executable by SCP or thumb-drive to the raspberry pi, which is pretty slick. Hats off to the developer(s) of Mono.

Saturday, April 18, 2015

Hindenbugs, Heisenbugs and other types of software bugs

Different Types of Software Bugs

Sorry, I haven't gotten around to creating a new post in close to a month (yikes!). I have just been super busy at work and have been trying to complete about 3 out of several not-quite-finished personal projects that I will be releasing on GitHub and blogging about

Heisenbug--

1) A software bug that disappears or alters its behavior by the action of attempting to debug it.

Possible sources: The effect of debuggers on the run-time code or JIT can be one source. Another might arise because of process scheduling or threading and race conditions. Or both, as stepping through the code with a debugger will affect the timing of threads.

Derivation: Werner Heisenberg and his Uncertainty Principle.

Mandelbug--

1) A bug whose causes are so complex it defies repair, or makes its behavior appear chaotic or even non-deterministic. Other symptoms include core dumps that, when viewed in an editor, form complex but repeating patterns or designs of characters and symbols.

Possible sources: Operating system environment or configuration, the presence or absence of file system objects, time or scheduling-dependent code. Also, a very complicated state machine esp. one with more than one variable that hold state information and where transitions may not exists for every possible permutation of one state to another.

Derivation: Benoît Mandelbrot and his research in fractal geometry.

Schrödinbug--

1) A bug that manifests itself in running software after a programmer notices that the code should never have worked in the first place.

Possible sources: A combination of naughty coding practices such as coding by coincidence and something else that allows for unexpected behavior such as a the effect of custom configurations or compiler optimizations, code obfuscation, or the JIT or on code. Also, a dependency on some arcane piece of technology that no one truly understands or employment of a black-box system or library, or some ancient, dinosaur system with a highly specific configuration that is so fragile that no-one dares update it because said updates could likely or is known to break functionality and where even a reboot is considered hairy.

Derivation: Named for Erwin Schrödinger and his thought experiment.

Bohrbug--

1) In contrast to the Heisenbug, the Bohrbug, is a "good, solid bug", easy to hunt down, or easily predicted from the description, esp. when a bug is the result of a classical programming mistake, or 'a rookie move'.

Possible sources: Failing to initialize a variable or class to anything other than null, failure to RTFM, misuse of pointers or general abuse of memory, not coding defensively, coding by coincidence or not truly understanding how (or why) your code works.

Derivation: Named after the physicist Niels Bohr and his rather simple atomic model.

Hindenbug--

1) A bug with catastrophic consequences, esp. one that actually causes the server to burst into bright, hot flames.

Possible sources: Code that dynamically builds and executes SQL scripts, esp. scripts that make use of the DELETE FROM command, code that is self modifying or replicates, or code that controls large machines or a physical system such as pumps, belts, cooling systems, aggregate pulverizer, fans, or any sort of thing with large, rotating blades. Also, any software that controls, monitors, tests, or is in any way whatsoever, wired up to a tactical nuclear defense system. In fact, lets just include anything with 'nuclear' or 'pulverize' in the name.

Derivation: Refers to the Hindenburg disaster. The Hindenburg Blimp was, in turn, named after Paul von Hindenburg, the then-president of Germany from 1925->1934.

Kishor S. Trivedi has a great talk/slideshow about Software Faults, Failures and Their Mitigations. He posits that it is not realistic to write software that is 100% bug free, or have 100% up-time. He displays the downtime in terms of minutes per year (MPY?) of several reliable corporations to support his claims, and it is true that as far as I am aware,there have not been any software ever written that did not have some downtime, even NASA.

However, Trivedi shares a quote by one E. W. Dijkstra: "Testing shows the presence, not the absence, of bugs". Aww, logic. Now that's something I can get behind. Indeed; It is impossible to provably show that any software (of sufficient complexity) is absolutely error-free.

Following this, he suggests we should not strive for the virtually impossible task building fault-free software, but rather aim to build instead fault-tolerant software.

Software Faults -> Software Errors -> Software Failures

Software Faults lead to Software Errors that lead to Software Failures.
Trivedi defines faults, errors and failures:
      -Software Failure occurs when the delivered service no longer complies with the desired output.
      -Software Error is that part of the system state which is liable to lead to subsequent failure.
      -Software Fault is adjudged or hypothesized cause of an error.Faults are the cause of errors that may lead to failures

    So, how does one test their tactical nuclear defense system code? VERY carefully, and it probably wouldn't hurt to comment out the Launch() method as well.

Wednesday, February 18, 2015

The Feature Flags Pattern

I was listening to Episode 1101 of the podcast Dot Net Rocks. Jez Humble was talking about the concept of Feature Toggles or Feature Flags. Feature Switches? While the term has some opinionated definitions, the concept that I found most interesting was the idea of deploying software with the new features initially disabled, or switched off by some mechanism. After you think the feature is ready for production, switch on the feature. If there is an issue, you don't have to roll back the version or deploy another release, just switch the feature back off again and replace its .dll. Didn't get enough debug information? Switch the feature back on and let a few crash reports trickle in and then switch it back off. If you feature is particularly processor or network intensive, you can perform load testing by slowly releasing the features to select clients or only part of the population/userbase.

While I personally choose an SQL table for my approach to storing the toggle switches (internal business app), one could use the application's .config file. An application could pull the settings from a .config file on a networked drive as a way of controlling multiple application instances by making one changed to a centralized location. Below I show a minimalist implementation by creating a Dictionary from the <appSettings> in a App.config file.

Behold:


public static Dictionary<string, bool> GetFeatureFlags()
{
  return ConfigurationManager.AppSettings.AllKeys.ToDictionary(s => s, IsFlagSet);
}

public static bool IsFlagSet(string settingName)
{
  bool result = false;
  bool.TryParse(ConfigurationManager.AppSettings[settingName], out result);
  return result;
}

Of course with a dictionary you have to be careful that you don't try an access a column that does not exist with the indexer, so you might be better off using IsFlagSet(string), which will never throw. Although this is of limited use (AppSettings is already a NameValueCollecion), perhaps you can make use of this generic function I wrote that uses generics to convert AppSettings into a dictionary with the type of the value being the generic:


public static Dictionary<string, T> GetDictionary<T>()
{
 return ConfigurationManager.AppSettings.AllKeys.ToDictionary<string, string, T>(s => s, GetSetting<T>);
}

public static T GetSetting<T>(string settingName)
{
  try
  {
    T result = (T) Convert.ChangeType(
      ConfigurationManager.AppSettings[settingName],
      typeof (T)
    );
    if (result != null)
    {
      return result;
    }
  }
  catch { }

  return default(T);
}

Please note that swallowing an exception ("catch { }") is typically considered poor form. In this particular scenario, I am aware of the possible exceptions that can be thrown by this code and I want the code to return the default(T) in this scenario and never throw.

Wednesday, February 4, 2015

EditLabel UserControl

EditLabel is editable Label control, who's text can be modified at run-time by double-clicking on it. Under the hood, it contains a Label control and TextBox control. It works like a Label control until you double-click it with the mouse. When double-clicked it enters 'edit-mode' where a TextBox overlays the Label. The control will AutoSize to the characters you type in the TextBox, updating the Label's size at the same time. You can leave the control's edit mode by hitting either enter, the escape key, or letting the control lose focus. After editing, the control acts like a label again, with the updated text. You can check out the code for the EditLabel project here on my GitHub.

The EditLabel isn't too interesting by itself, but I wanted to blog about it for all the interesting attributes that I never knew about until I started getting into designing controls, and controlling how controls are displayed in the designer window in visual studio.

The first thing my EditLabel control needed was and event that the parent form could subscribe to know when the text changed. Creating the event simply isn't enough. If you want the event to show up in the Visual Studio's Form Designer when you click on events, you need to add the Browsable and EditorBrowsable attribute. This will make is show up in the designer. You can control the description that appears as well as what category it shows up under when viewing events/properties in the Categorize view as opposed to Alphabetical view by using the Description and Category attributes respectively:


[Category("Property Changed")]
[Description("Event raised when the value of the Text property is changed on Control.")]
[Browsable(true), EditorBrowsable(EditorBrowsableState.Always)]
public new event EventHandler TextChanged;

There is also the notion of default events. That is, the event for which a handler is added when you double click the control in the designer. For example, when you double click a TextBox, it automatically adds the TextChanged event, instead of, say, the KeyDown event. To make an event the default event, add the DefaultEvent attribute at the control's class-level declaration:


[DefaultEvent("TextChanged")]
public partial class EditLabel : UserControl
{
   ...

Another thing I wanted was the Text property. Like a Label or a TextBox control, I wanted a public Text property that could be edited at design time. However, every time I re-built the project, the text would get reset to the default. As it turns out, if you want modifications to properties to be persisted, it must be serialized to the MainForm.Designer.cs, or whatever the auto-generated Designer.cs file is. In order to tell the designer to serialize changes made at design time to Designer.cs file, you have to add the DesignerSerializationVisibility attribute above the property. Here is the attributes and declaration for the Text property:


[Category("Appearance")]
[Description("The text associated with the control.")]
[Browsable(true), EditorBrowsable(EditorBrowsableState.Always)]
[DesignerSerializationVisibility(DesignerSerializationVisibility.Visible)]
public override string Text
{
   get { return ctrlLabel.Text; }
   set { ChangeText(value); }
}

You can view the full EditLabel.cs here, or view the entire EditLabel project here.

Thursday, January 22, 2015

Graceful Console Exit

When writing console apps sometimes you want to delay the closing of the console window to give the user time to read the output. Often, if something went wrong, I would simply report the error to the standard output and immediately exit the application. However that did not work too well for my users; the exit message was not displayed long enough for them to read so they knew how to correct the issue. What I needed was something that give a user enough time to read the output, but not prevent the process from exiting if it was being ran as a scheduled/automated task. The solution I came up with was a countdown timer that would block execution from proceeding until it counted back from some number to zero.

Lets view the code, shall we:


public static void CountdownTimer(int Seconds = 5)
{
    Console.ResetColor();
    Console.WriteLine(Environment.NewLine);
    Console.Write("The application will now terminate...   ");

    if (Seconds < 1) return;
    if (Seconds > 9) Seconds = 9;

    for (int counter = Seconds * 10; counter > 0; counter--)
    {
        if (Console.KeyAvailable)
            return;

        if (counter % 10 == 0)
            Console.Write("\b{0}", (counter / 10));

        System.Threading.Thread.Sleep(100);
    }
}

I print each number as the timer counts down. Each time I print a new number, I place a backspace "\b" before it to erase the previous number. To avoid having to keep track how many character a digit prints to use the same number of backspaces, I just kept the function simple by limiting the number of seconds to a max of 9.
I check for Console.KeyAvailable to detect key strokes to skip the rest of the countdown and exit immediately. The reason I did 10 Sleeps of 100ms per second was to make the exit upon key press more responsive.

Tuesday, January 20, 2015

Validate all input parameters in one line/Check several objects for empty or null in a single method call.

Often my methods start with several guarding clauses. That is, conditional if statements that check for null or empty parameters and immediately return if they are. This is also known as defensive programming. Usually I am not concerned with this code; indeed I identify these blocks as validation code and overlook this code entirely. It was not until recently that I noticed this as an area that I was repeating myself and could be put in a reusable function.

Lets look at the code:


/// <summary>
/// Checks the parameters for empty, nulls, or invalid states.
/// </summary>
/// <returns>True if the params are null, empty, contains an array or object that is null or empty, contains a blank, whitespace, null or empty string, or contains DataTable that does not pass a call to IsValidDatatable().</returns>
public static bool ContainsNullOrEmpty(params object[] Items)
{
    if (Items == null || Items.Length < 1)
        return true;
    
    foreach (object item in Items)
    {
        if (item == null)
            return true;
        
        if (item is string)
        {
            if (string.IsNullOrWhiteSpace(item as String))
                return true;
        }
        else if (item is DataTable)
        {
            if (!IsValidDatatable(item as DataTable))
                return true;
        }
        
        if (item.GetType().IsArray)
        {
            bool isEmpty = true;
            foreach (object itm in (Array)item)
            {
                if (ContainsEmptyOrNulls(itm))
                    return true;
                
                isEmpty = false;
            }
            if (isEmpty)
                return true;
        }
    }

    return false;
}

My approach above uses the params keyword. By using params, I can pass in any number of parameters (including zero, although that doesn't help us in this context). By using an object instead of a generic type I can pass multiple different types in one method call. If I used generics, I would have to have one method call for each type that I wanted to validate.

The idea is to cram all of your common validation logic into this method, and call it everywhere to increase the readability of your business logic by not cluttering it up with validation logic. Notice how this function also checks for empty or white-space strings, as well as calling a custom IsValidDatatable(DataTable) function. If you have several functions that return an int of -1 or 0 upon failure, you might want to add another conditional to check if (item is int) and then if the value of the integer reflects an erroneous state.

Another cool feature it that if the item is an an array, or even an array of nested arrays, it will still check every item in those arrays. Notice how I get the item type, then check the Type.IsArray property boolean. If it is true, I cast the object as an System.Array, then call ContainsEmptyOrNulls() recursively in a foreach loop. We can return true right away on the first null condition met, but we must be careful not to return on a false condition and to instead let the false conditions fall through and continue on, in the case that there is a null later or in another array.

Enjoy!

Tuesday, January 13, 2015

EntityJustWorks - C# class/object/entity to SQL database/script/command mapper.

Note by author:

   Since writing this, I have expanded on this idea quite a bit. I have written a lightweight ORM class library that I call EntityJustWorks.
   The full project can be found on GitHub or CodePlex.

   EntityJustWorks not only goes from a class to DataTable (below), but also provides:

Populate the public properties of a class from a DataTable

DataTable to class as a C# code file (using CodeDOM)

DataTable to class Assembly (using Reflection.Emit)

C# class to a SQL 'CREATE TABLE' or 'INSERT INTO' script

SQL 'SELECT' statement to a List<T> of populated classes, each one resembling a row

Security Warning:
This library generates dynamic SQL, and has functions that generate SQL and then immediately executes it. While it its true that all strings funnel through the function Helper.EscapeSingleQuotes, this can be defeated in various ways and only parameterized SQL should be considered SAFE. If you have no need for them, I recommend stripping semicolons ; and dashes --. Also there are some Unicode characters that can be interpreted as a single quote or may be converted to one when changing encodings. Additionally, there are Unicode characters that can crash .NET code, but mainly controls (think TextBox). You almost certainly should impose a white list:
string clean = new string(dirty.Where(c => "abcdefghijklmnopqrstuvwxyz0123456789.,\"_ !@".Contains(c)).ToArray());

PLEASE USE the SQLScript.StoredProcedure and DatabaseQuery.StoredProcedure classes to generate SQL for you, as the scripts it produces is parameterized. All of the functions can be altered to generate parameterized instead of sanitized scripts. Ever since people have started using this, I have been maintaining backwards compatibility. However, I may break this in the future, as I do not wish to teach one who is learning dangerous/bad habits. This project is a few years old, and its already showing its age. What is probably needed here is a total re-write, deprecating this version while keep it available for legacy users after slapping big warnings all over the place. This project was designed to generate the SQL scripts for standing up a database for a project, using only MY input as data. This project was never designed to process a USER'S input.! Even if the data isn't coming from an adversary, client/user/manually entered data is notoriously inconsistent. Please do not use this code on any input that did not come from you, without first implementing parameterization. Again, please see the SQLScript.StoredProcedure class for inspiration on how to do that.

    In this post, I bring the wagons 'round full circle and use what I have shown you in older posts to make a class that can create (or insert into) a table in an SQL database from a C# class object at run-time. The only prior knowledge needed would be the connection string. The name of the the table will be the name of the class Type, and the tables's columns are created from the class's public properties, with their names being the property names.

    If you want to take a data-first approach, you can generate the code to C# classes that mirrors a table in a relational database given a connection string and the table name, or use the new Emit class and generate a dynamic assembly at run-time that contains a class with public auto-properties matching the DataColumns that you can use to store DataRows in. Similarly, the name of class name will be the same name as the table, with each of the table's columns being represented by a public auto property on the class of the same name and type. ~~An improvement here would be to create the class inside a dynamic assembly using System.Reflection.Emit.~~ Anywho, I have been holding on to this code for too long I just want to get it out of the door.

    For those of you who just want the code:
  GitHub (Download zip)
BitBucket (Download)
code.msdn.microsoft.com

    Going from a query to a DataTable is as simple as a call to SQLAdapter.Fill(). For populating the public properties of a class from a DataTable, see this post I made. To build a DataTable who's columns match the public properties of a class, check out this post here. Going from a DataTable to a class as a C# code file is covered in this previous post.

    Going from a DataTable to SQL CREATE TABLE or INSERT INTO scripts that can be passed to ExecuteNonQuery() to execute the command, see below.

    Here is the INSERT INTO code. This code takes advantage of some functions in the Helper class that converts a DataTable's ColumnNames and Row's Values as a List of String, which is then passed into a function that takes a list of string and formats it with a delimiter string and prefix and postfix strings so I can easily format the list to look like the COLUMNS and VALUES part of an SQL INSERT INTO statement. For example: INSERT INTO ([FullName], [Gender], [BirthDate]) VALUES ('John Doe', 'M', '10/3/1981''). If you want to view how the functions RowToColumnString(), RowToValueString(), and ListToDelimitedString() work, click here to view Helper.cs.


public static string InsertInto<T>(params T[] ClassObjects) where T : class
{
   DataTable table = Map.ClassToDatatable<T>(ClassObjects);
   return InsertInto(table);   // We don't need to check IsValidDatatable() because InsertInto does
}

public static string InsertInto(DataTable Table)
{
   if (!Helper.IsValidDatatable(Table))
      return string.Empty;

   StringBuilder result = new StringBuilder();
   foreach (DataRow row in Table.Rows)
   {
      if (row == null || row.Table.Columns.Count < 1 || row.ItemArray.Length < 1)
         return string.Empty;

      string columns = Helper.RowToColumnString(row);
      string values = Helper.RowToValueString(row);

      if (string.IsNullOrWhiteSpace(columns) || string.IsNullOrWhiteSpace(values))
         return string.Empty;

      result.AppendFormat("INSERT INTO [{0}] {1} VALUES {2}", Table.TableName, columns, values);
   }

   return result.ToString();
}

Here is the CREATE TABLE CODE. A very important function here is the GetDataTypeString() member, which converts the Column.DataType to the SQL equivalent as a string, which is used to craft a CREATE TABLE script. I use static strings formatted to work with String.Format to replace the crucial bits in the CREATE TABLE command:


public static string CreateTable<T>(params T[] ClassObjects) where T : class
{
 DataTable table = Map.ClassToDatatable<T>(ClassObjects);
 return Script.CreateTable(table);
}

public static string CreateTable(DataTable Table)
{
   if (Helper.IsValidDatatable(Table, IgnoreRows: true))
      return string.Empty;

   StringBuilder result = new StringBuilder();
   result.AppendFormat("CREATE TABLE [{0}] ({1}", Table.TableName, Environment.NewLine);

   bool FirstTime = true;
   foreach (DataColumn column in Table.Columns.OfType<DataColumn>())
   {
      if (FirstTime) FirstTime = false;
      else
         result.Append(",");

      result.AppendFormat("[{0}] {1} {2}NULL{3}",
         column.ColumnName,
         GetDataTypeString(column.DataType),
         column.AllowDBNull ? "" : "NOT ",
         Environment.NewLine
      );
   }
   result.AppendFormat(") ON [PRIMARY]{0}GO", Environment.NewLine);

   return result.ToString();
}

private static string GetDataTypeString(Type DataType)
{
   switch (DataType.Name)
   {
      case "Boolean": return "[bit]";
      case "Char": return "[char]";
      case "SByte": return "[tinyint]";
      case "Int16": return "[smallint]";
      case "Int32": return "[int]";
      case "Int64": return "[bigint]";
      case "Byte": return "[tinyint] UNSIGNED";
      case "UInt16": return "[smallint] UNSIGNED";
      case "UInt32": return "[int] UNSIGNED";
      case "UInt64": return "[bigint] UNSIGNED";
      case "Single": return "[float]";
      case "Double": return "[double]";
      case "Decimal": return "[decimal]";
      case "DateTime": return "[datetime]";
      case "Guid": return "[uniqueidentifier]";
      case "Object": return "[variant]";
      case "String": return "[nvarchar](250)";
      default: return "[nvarchar](MAX)";
   }
}

...

Here is definition of all the public members:


namespace EntityJustWorks.SQL
{
   public static class Convert
   {
      public static string SQLTableToCSharp(string ConnectionString, string TableName);
      public static bool ClassToSQLTable(string ConnectionString, params T[] ClassCollection);
   }
   
   public static class Code
   {
      public static string SQLTableToCSharp(string ConnectionString, string TableName);
      public static string DatatableToCSharp(DataTable Table);
   }

   public static class Script
   {
      public static string InsertInto<T>(params T[] ClassObjects);
      public static string InsertInto(DataTable Table);
      public static string CreateTable<T>(params T[] ClassObjects);
      public static string CreateTable(DataTable Table);
   }
   
   public static class Map
   {
      public static IList<T> DatatableToClass<T>(DataTable Table);
      public static DataTable ClassToDatatable<T>(params T[] ClassCollection);
      public static DataTable ClassToDatatable<T>();
   }

   public static class Query
   {
      public static IList<T> QueryToClass<T>(string ConnectionString, string FormatString_Query,
                                                            params object[] FormatString_Parameters);
      public static DataTable QueryToDataTable(string ConnectionString, string FormatString_Query,
                                                            params object[] FormatString_Parameters);
      public static T QueryToScalarType<T>(string ConnectionString, string FormatString_Query,
                                                            params object[] FormatString_Parameters);
      public static int ExecuteNonQuery(string ConnectionString, string FormatString_Command,
                                                            params object[] FormatString_Parameters);
   }
   
   public static class Helper
   {
      public static bool IsValidDatatable(DataTable Table, bool IgnoreRows = false);
      public static bool IsCollectionEmpty<T>(IEnumerable<T> Input);
      public static bool IsNullableType(Type Input);
   }
}

Please feel free to comment, either on my GitHub or my blog, with criticisms or suggestions.

Tags