Item classification problem using ML.NET with Naive Bayes - C#

I am new to machine learning and ML.NET. I want to solve an Excel column identification task.
The Excel columns have headers like 序号, 编号, 编码, 名称, 项目名称, and for each column header there is a corresponding field name, as follows:
Column_Field.csv (tab-separated):
Column      FieldName
序号        OrdCode
编号        OrdCode
编码        OrdCode
名称        Name
项目名称    Name
Each field may correspond to one or more column names; for example, 序号, 编号, and 编码 all map to OrdCode. The task is to identify the corresponding field name for an incoming column.
Based on the above dataset, I use ML.NET and want to predict the right field for columns read from an Excel file.
I use the Naive Bayes algorithm. The code:
public class Program
{
    private static readonly string _dataPath = Path.Combine(Environment.CurrentDirectory, "Data", "Column_Field.csv");

    private static void Main(string[] args)
    {
        MLContext mlContext = new MLContext();
        IDataView dataView = mlContext.Data.LoadFromTextFile<ColumnInfo>(_dataPath, hasHeader: true, separatorChar: '\t');
        var pipeline = mlContext.Transforms.Conversion.MapValueToKey(inputColumnName: "Label", outputColumnName: "Label")
            .Append(mlContext.Transforms.Text.FeaturizeText(outputColumnName: "Features", inputColumnName: "Column"))
            .Append(mlContext.MulticlassClassification.Trainers.NaiveBayes())
            .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
        var model = pipeline.Fit(dataView);

        // evaluate
        //List<ColumnInfo> dataForEvaluation = new List<ColumnInfo>()
        //{
        //    new ColumnInfo{ Column="名称", FieldName="Name" },
        //    new ColumnInfo{ Column="<名称>", FieldName="Name" },
        //    new ColumnInfo{ Column="序号", FieldName="OrdCode" },
        //};
        //IDataView testDataSet = mlContext.Data.LoadFromEnumerable(dataForEvaluation);
        //// Evaluate expects scored data, so run the model over the test set first.
        //var metrics = mlContext.MulticlassClassification.Evaluate(model.Transform(testDataSet));
        //Console.WriteLine($"MicroAccuracy: {metrics.MicroAccuracy:P2}");
        //Console.WriteLine($"MacroAccuracy: {metrics.MacroAccuracy:P2}");

        // predict
        var dataForPrediction = new List<ColumnInfo>();
        dataForPrediction.Add(new ColumnInfo { Column = "名称" });
        dataForPrediction.Add(new ColumnInfo { Column = "ABC" });
        dataForPrediction.Add(new ColumnInfo { Column = "名" });
        var engine = mlContext.Model.CreatePredictionEngine<ColumnInfo, Prediction>(model);
        foreach (var data in dataForPrediction)
        {
            var result = engine.Predict(data);
            Console.WriteLine($"{data.Column}: \t{result.FieldName}");
        }
        Console.ReadLine();
    }
}
public class ColumnInfo
{
    [LoadColumn(0)]
    public string Column { get; set; }

    [LoadColumn(1), ColumnName("Label")]
    public string FieldName { get; set; }
}

public class Prediction
{
    [ColumnName("PredictedLabel")]
    public string FieldName { get; set; }
}
However, the result is not as expected.
Result:
名称: OrdCode
ABC: OrdCode
名: OrdCode
So what is wrong with the code? I suspect the problem is a lack of proper data processing in the pipeline before training.
Thanks.
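One thing worth checking (not a confirmed diagnosis): the default FeaturizeText word tokenizer splits on whitespace and punctuation, and Chinese headers contain no spaces, so each header may end up as a single opaque token; NaiveBayes in ML.NET is also a fairly weak baseline. A sketch that switches to explicit character n-grams and a different trainer, assuming the same ColumnInfo schema as above (option types live in Microsoft.ML.Transforms.Text):

// Sketch only: featurize the header with character n-grams so headers that
// share characters (名称, 项目名称) also share features.
var options = new TextFeaturizingEstimator.Options
{
    // Drop word-level features: each whole header would be one token.
    WordFeatureExtractor = null,
    // Character bigrams (and shorter lengths via UseAllLengths).
    CharFeatureExtractor = new WordBagEstimator.Options { NgramLength = 2, UseAllLengths = true },
};
var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
    .Append(mlContext.Transforms.Text.FeaturizeText("Features", options, "Column"))
    // SdcaMaximumEntropy is often a stronger multiclass baseline than NaiveBayes.
    .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy())
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));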


CsvHelper PrepareHeaderForMatch returns Context as one-item array

I'd been using CsvHelper version 6.0.0 and decided to upgrade to the latest (currently 12.3.2), and found that it now uses another parameter, an index, in the lambda for csv.Configuration.PrepareHeaderForMatch (Func<string, int, string>).
The code for v6.0.0 looked like this:
csv.Configuration.PrepareHeaderForMatch = header => Regex.Replace(header, @"\/", string.Empty);
With the previous line, IReadingContext.Record returns an array with multiple entries, one for each column.
The code for v12.3.2 looks like this:
csv.Configuration.PrepareHeaderForMatch = (header, index) => Regex.Replace(header, @"\/", string.Empty);
But ReadingContext.Record now returns an array with all columns in just one entry. I used the exact same file for both versions, and tried messing with the lambda, but the outcome is the same. How can I get the columns as separate entries in the Record array?
Thanks in advance!
update - This is an issue with the delimiter, which has changed since version 6.0.0. The default delimiter now uses CultureInfo.CurrentCulture.TextInfo.ListSeparator. Since I'm in the United States, my ListSeparator is ",", so both examples work for me. For many countries the ListSeparator is ";", which is why version 12.3.2 found only 1 column for @dzookatz. The solution is to specify the delimiter in the configuration.
csv.Configuration.PrepareHeaderForMatch = header => Regex.Replace(header, @"\/", string.Empty);
csv.Configuration.Delimiter = ",";
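If you want to confirm what default delimiter CsvHelper will pick up on a given machine, you can print the culture's list separator (a quick check; assumes System.Globalization is imported):

// What CsvHelper 12.x uses as the default delimiter on this machine.
Console.WriteLine(CultureInfo.CurrentCulture.TextInfo.ListSeparator);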
I must be missing something. I get the same result for var record whether using version 6.0.0 or 12.3.2. I'm guessing there is more going on with your data that I'm not seeing.
Version 6.0.0
class Program
{
    static void Main(string[] args)
    {
        var fooString = $"Id,First/Name{Environment.NewLine}1,David";
        using (var reader = new StringReader(fooString))
        using (var csv = new CsvReader(reader))
        {
            csv.Configuration.PrepareHeaderForMatch = header => Regex.Replace(header, @"\/", string.Empty);
            csv.Read();
            csv.ReadHeader();
            while (csv.Read())
            {
                var record = csv.Context.Record;
            }
        }
    }
}

public class Foo
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}
Version 12.3.2
public class Program
{
    public static void Main(string[] args)
    {
        var fooString = $"Id,First/Name{Environment.NewLine}1,David";
        using (var reader = new StringReader(fooString))
        using (var csv = new CsvReader(reader))
        {
            csv.Configuration.PrepareHeaderForMatch = (header, index) => Regex.Replace(header, @"\/", string.Empty);
            csv.Read();
            csv.ReadHeader();
            while (csv.Read())
            {
                var record = csv.Context.Record;
            }
        }
    }
}

public class Foo
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

Add custom column to IDataView in ML.NET

I'd like to add a custom column after loading my IDataView from a file.
In each row, the column value should be the sum of the previous 2 values - a sort of Fibonacci series.
I was thinking of creating a custom transformer, but I wasn't able to find anything that could help me understand how to proceed.
I also tried cloning the ML.NET Git repository to see how other transformers were implemented, but many classes are marked as internal, so I cannot reuse them in my project.
There is a way to create a custom transform with CustomMapping.
Here's an example I used for this answer.
The input and output classes:
class InputData
{
    public int Age { get; set; }
}

class CustomMappingOutput
{
    public string AgeName { get; set; }
}

class TransformedData
{
    public int Age { get; set; }
    public string AgeName { get; set; }
}
Then, in the ML.NET program:
MLContext mlContext = new MLContext();

var samples = new List<InputData>
{
    new InputData { Age = 16 },
    new InputData { Age = 35 },
    new InputData { Age = 60 },
    new InputData { Age = 28 },
};
var data = mlContext.Data.LoadFromEnumerable(samples);

Action<InputData, CustomMappingOutput> mapping =
    (input, output) =>
    {
        if (input.Age < 18)
        {
            output.AgeName = "Child";
        }
        else if (input.Age < 55)
        {
            output.AgeName = "Man";
        }
        else
        {
            output.AgeName = "Grandpa";
        }
    };

var pipeline = mlContext.Transforms.CustomMapping(mapping, contractName: null);
var transformer = pipeline.Fit(data);
var transformedData = transformer.Transform(data);

var dataEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(transformedData, reuseRowObject: true);
foreach (var row in dataEnumerable)
{
    Console.WriteLine($"{row.Age}\t {row.AgeName}");
}
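Note that CustomMapping sees a single row at a time, so it cannot reference the previous two rows directly. For the sum-of-previous-two-values column from the question, one workaround (a sketch, using a hypothetical RowWithSum class) is to pull the rows out, compute the column in plain C#, and load the result back as a new IDataView:

// Hypothetical row type carrying the question's "sum of previous two values" column.
class RowWithSum
{
    public int Age { get; set; }
    public int SumOfPreviousTwo { get; set; }
}

// Enumerate the existing IDataView, carry the last two values along,
// and rebuild an IDataView with the extra column.
var rows = mlContext.Data.CreateEnumerable<InputData>(data, reuseRowObject: false);
var withSums = new List<RowWithSum>();
int prev1 = 0, prev2 = 0;
foreach (var r in rows)
{
    withSums.Add(new RowWithSum { Age = r.Age, SumOfPreviousTwo = prev1 + prev2 });
    prev2 = prev1;
    prev1 = r.Age;
}
IDataView augmented = mlContext.Data.LoadFromEnumerable(withSums);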
Easy thing. I am assuming you know how to use pipelines.
This is a part of my project, where I merge two columns together:
IEstimator<ITransformer> pipeline = mlContext.Transforms.CustomMapping(mapping, contractName: null)
    .Append(mlContext.Transforms.Text.FeaturizeText(inputColumnName: "question1", outputColumnName: "question1Featurized"))
    .Append(mlContext.Transforms.Text.FeaturizeText(inputColumnName: "question2", outputColumnName: "question2Featurized"))
    .Append(mlContext.Transforms.Concatenate("Features", "question1Featurized", "question2Featurized"))
    //.Append(mlContext.Transforms.NormalizeMinMax("Features"))
    //.AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: nameof(customTransform.Label), featureColumnName: "Features"));
As you can see, the two columns question1Featurized and question2Featurized are combined into Features, which will be created and can be used like any other column of the IDataView. The Features column does not need to be declared in a separate class.
So in your case you should first transform the columns according to their data type: if they are strings, you can do what I did; in the case of numeric values, use a custom transformer/CustomMapping.
The documentation of the Concatenate function might help as well!
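To verify that the concatenated Features column really exists without declaring it anywhere, you can fit the pipeline and print the output schema (a small sketch; trainingData stands in for whatever IDataView you fit on):

ITransformer model = pipeline.Fit(trainingData);
IDataView transformed = model.Transform(trainingData);
// Every column the pipeline produced, including "Features", shows up here.
foreach (var column in transformed.Schema)
    Console.WriteLine($"{column.Name}: {column.Type}");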

Validate content of csv file c#

I have a requirement where a user will upload a csv file in the format below, containing around 1.8 to 2 million records:
SITE_ID,HOUSE,STREET,CITY,STATE,ZIP,APARTMENT
44,545395,PORT ROYAL,CORPUS CHRISTI,TX,78418,2
44,608646,TEXAS AVE,ODESSA,TX,79762,
44,487460,EVERHART RD,CORPUS CHRISTI,TX,78413,
44,275543,EDWARD GARY,SAN MARCOS,TX,78666,4
44,136811,MAGNOLIA AVE,SAN ANTONIO,TX,78212
What I have to do is first validate the file, and then save it to the database only if it validated successfully and has no errors. The validations I have to apply are different for each column. For example:
SITE_ID: it can only be an integer, and it is required.
HOUSE: integer, required
STREET: alphanumeric, required
CITY: letters only, required
STATE: 2 letters only, required
ZIP: 5 digits only, required
APARTMENT: integer only, optional
I need a generic way of applying these validations to the respective columns. What I have tried so far: I converted the csv file to a DataTable and plan to validate each cell through regex, but this doesn't seem like a generic or good solution to me. Can anyone help me in this regard and point me in the right direction?
Here is one efficient method:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Data.OleDb;
using System.Text.RegularExpressions;
using System.IO;

namespace ConsoleApplication23
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.csv";

        static void Main(string[] args)
        {
            CSVReader csvReader = new CSVReader();
            DataSet ds = csvReader.ReadCSVFile(FILENAME, true);
            RegexCompare compare = new RegexCompare();
            DataTable errors = compare.Get_Error_Rows(ds.Tables[0]);
        }
    }

    class RegexCompare
    {
        public static Dictionary<string, RegexCompare> dict = new Dictionary<string, RegexCompare>() {
            { "SITE_ID", new RegexCompare() { columnName = "SITE_ID", pattern = @"[^\d]+", positiveNegative = false, required = true}},
            { "HOUSE", new RegexCompare() { columnName = "HOUSE", pattern = @"[^\d]+", positiveNegative = false, required = true}},
            { "STREET", new RegexCompare() { columnName = "STREET", pattern = @"[A-Za-z0-9 ]+", positiveNegative = true, required = true}},
            { "CITY", new RegexCompare() { columnName = "CITY", pattern = @"[A-Za-z ]+", positiveNegative = true, required = true}},
            { "STATE", new RegexCompare() { columnName = "STATE", pattern = @"[A-Za-z]{2}", positiveNegative = true, required = true}},
            { "ZIP", new RegexCompare() { columnName = "ZIP", pattern = @"\d{5}", positiveNegative = true, required = true}},
            { "APARTMENT", new RegexCompare() { columnName = "APARTMENT", pattern = @"\d*", positiveNegative = true, required = false}},
        };

        string columnName { get; set; }
        string pattern { get; set; }
        Boolean positiveNegative { get; set; }
        Boolean required { get; set; }

        public DataTable Get_Error_Rows(DataTable dt)
        {
            DataTable dtError = null;
            foreach (DataRow row in dt.AsEnumerable())
            {
                Boolean error = false;
                foreach (DataColumn col in dt.Columns)
                {
                    RegexCompare regexCompare = dict[col.ColumnName];
                    object colValue = row.Field<object>(col.ColumnName);
                    if (regexCompare.required)
                    {
                        if (colValue == null)
                        {
                            error = true;
                            break;
                        }
                    }
                    else
                    {
                        if (colValue == null)
                            continue;
                    }
                    string colValueStr = colValue.ToString();
                    Match match = Regex.Match(colValueStr, regexCompare.pattern);
                    if (regexCompare.positiveNegative)
                    {
                        if (!match.Success)
                        {
                            error = true;
                            break;
                        }
                        if (colValueStr.Length != match.Value.Length)
                        {
                            error = true;
                            break;
                        }
                    }
                    else
                    {
                        if (match.Success)
                        {
                            error = true;
                            break;
                        }
                    }
                }
                if (error)
                {
                    if (dtError == null) dtError = dt.Clone();
                    dtError.Rows.Add(row.ItemArray);
                }
            }
            return dtError;
        }
    }

    public class CSVReader
    {
        public DataSet ReadCSVFile(string fullPath, bool headerRow)
        {
            string path = fullPath.Substring(0, fullPath.LastIndexOf("\\") + 1);
            string filename = fullPath.Substring(fullPath.LastIndexOf("\\") + 1);
            DataSet ds = new DataSet();
            try
            {
                if (File.Exists(fullPath))
                {
                    string ConStr = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}" + ";Extended Properties=\"Text;HDR={1};FMT=Delimited\"", path, headerRow ? "Yes" : "No");
                    string SQL = string.Format("SELECT * FROM {0}", filename);
                    OleDbDataAdapter adapter = new OleDbDataAdapter(SQL, ConStr);
                    adapter.Fill(ds, "TextFile");
                    ds.Tables[0].TableName = "Table1";
                }
                foreach (DataColumn col in ds.Tables["Table1"].Columns)
                {
                    col.ColumnName = col.ColumnName.Replace(" ", "_");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            return ds;
        }
    }
}
Here's a rather overengineered but really fun generic method, where you give attributes to your class to match them to CSV column headers:
First step is to parse your CSV. There are a variety of methods out there, but my favourite is the TextFieldParser that can be found in the Microsoft.VisualBasic.FileIO namespace. The advantage of using this is that it's 100% native; all you need to do is add Microsoft.VisualBasic to the references.
Having done that, you have the data as List<String[]>. Now, things get interesting. See, now we can create a custom attribute and add it to our class properties:
The attribute class:
[AttributeUsage(AttributeTargets.Property)]
public sealed class CsvColumnAttribute : System.Attribute
{
    public String Name { get; private set; }
    public Regex ValidationRegex { get; private set; }

    public CsvColumnAttribute(String name) : this(name, null) { }

    public CsvColumnAttribute(String name, String validationRegex)
    {
        this.Name = name;
        this.ValidationRegex = new Regex(validationRegex ?? "^.*$");
    }
}
The data class:
public class AddressInfo
{
    [CsvColumnAttribute("SITE_ID", "^\\d+$")]
    public Int32 SiteId { get; set; }

    [CsvColumnAttribute("HOUSE", "^\\d+$")]
    public Int32 House { get; set; }

    [CsvColumnAttribute("STREET", "^[a-zA-Z0-9- ]+$")]
    public String Street { get; set; }

    [CsvColumnAttribute("CITY", "^[a-zA-Z0-9- ]+$")]
    public String City { get; set; }

    [CsvColumnAttribute("STATE", "^[a-zA-Z]{2}$")]
    public String State { get; set; }

    [CsvColumnAttribute("ZIP", "^\\d{1,5}$")]
    public Int32 Zip { get; set; }

    [CsvColumnAttribute("APARTMENT", "^\\d*$")]
    public Int32? Apartment { get; set; }
}
As you see, what I did here was link every property to a CSV column name, and give it a regex to validate the contents. On non-required stuff, you can still do regexes, but ones that allow empty values, as shown in the Apartment one.
Now, to actually match the columns to the CSV headers, we need to get the properties of the AddressInfo class, check for each property whether it has a CsvColumnAttribute, and if it does, match its name to the column headers of the CSV file data. Once we have that, we have a list of PropertyInfo objects, which can be used to dynamically fill in the properties of new objects created for all rows.
This method is completely generic, allows giving the columns in any order in the CSV file, and parsing will work for any class once you assign the CsvColumnAttribute to the properties you want to fill in. It will automatically validate the data, and you can handle failures however you want. In this code, all I do is skip invalid lines, though.
public static List<T> ParseCsvInfo<T>(List<String[]> split) where T : new()
{
    // No template row, or only a template row but no data. Abort.
    if (split.Count < 2)
        return new List<T>();
    String[] templateRow = split[0];
    // Create a dictionary of rows and their index in the file data.
    Dictionary<String, Int32> columnIndexing = new Dictionary<String, Int32>();
    for (Int32 i = 0; i < templateRow.Length; i++)
    {
        // ToUpperInvariant is optional, of course. You could have case sensitive headers.
        String colHeader = templateRow[i].Trim().ToUpperInvariant();
        if (!columnIndexing.ContainsKey(colHeader))
            columnIndexing.Add(colHeader, i);
    }
    // Prepare the arrays of property parse info. We set the length
    // so the highest found column index exists in it.
    Int32 numCols = columnIndexing.Values.Max() + 1;
    // Actual property to fill in
    PropertyInfo[] properties = new PropertyInfo[numCols];
    // Regex to validate the string before parsing
    Regex[] propValidators = new Regex[numCols];
    // Type converters for automatic parsing
    TypeConverter[] propconverters = new TypeConverter[numCols];
    // Go over the properties of the given type, see which ones have a
    // CsvColumnAttribute, and put these in the list at their CSV index.
    foreach (PropertyInfo p in typeof(T).GetProperties())
    {
        object[] attrs = p.GetCustomAttributes(true);
        foreach (Object attr in attrs)
        {
            CsvColumnAttribute csvAttr = attr as CsvColumnAttribute;
            if (csvAttr == null)
                continue;
            Int32 index;
            if (!columnIndexing.TryGetValue(csvAttr.Name.ToUpperInvariant(), out index))
            {
                // If no valid column is found, and the regex for this property
                // does not allow an empty value, then all lines are invalid.
                if (!csvAttr.ValidationRegex.IsMatch(String.Empty))
                    return new List<T>();
                // No valid column found: ignore this property.
                break;
            }
            properties[index] = p;
            propValidators[index] = csvAttr.ValidationRegex;
            // Automatic type converter. This function could be enhanced by giving a
            // list of custom converters as extra argument and checking those first.
            propconverters[index] = TypeDescriptor.GetConverter(p.PropertyType);
            break; // Only handle one CsvColumnAttribute per property.
        }
    }
    List<T> objList = new List<T>();
    // Start from 1 since the first line is the template with the column names.
    for (Int32 i = 1; i < split.Count; i++)
    {
        Boolean abortLine = false;
        String[] line = split[i];
        // Make a new object of the given type.
        T obj = new T();
        for (Int32 col = 0; col < properties.Length; col++)
        {
            // It is possible a line is not long enough to contain all columns.
            String curVal = col < line.Length ? line[col] : String.Empty;
            PropertyInfo prop = properties[col];
            // This can be null if the column was not found but wasn't required.
            if (prop == null)
                continue;
            // Check validity. Abort buildup of this object if not valid.
            Boolean valid = propValidators[col].IsMatch(curVal);
            if (!valid)
            {
                // Add logging here? We have the line and column index.
                abortLine = true;
                break;
            }
            // Automated parsing. Always use nullable types for nullable properties.
            Object value = propconverters[col].ConvertFromString(curVal);
            prop.SetValue(obj, value, null);
        }
        if (!abortLine)
            objList.Add(obj);
    }
    return objList;
}
To use on your CSV file, simply do
// the function using VB's TextFieldParser
List<String[]> splitData = SplitFile(datafile, new UTF8Encoding(false), ',');
// The above function, applied to the AddressInfo class
List<AddressInfo> addresses = ParseCsvInfo<AddressInfo>(splitData);
And that's it. Automatic parsing and validation, all through some added attributes on the class properties.
Note: if splitting the data in advance would give too much of a performance hit for large files, that's not really a problem; the TextFieldParser works from a Stream wrapped in a TextReader, so instead of passing a List<String[]> you can just pass a stream and do the csv parsing on the fly inside the ParseCsvInfo function, reading per CSV line directly from the TextFieldParser.
I didn't do that here because the original use case for which I wrote the reader to List<String[]> included automatic encoding detection, which required reading the whole file anyway.
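The SplitFile helper used in the snippet above isn't shown in the answer; a minimal sketch of what it might look like with TextFieldParser (assuming the path-plus-encoding constructor overload and quoted fields):

using Microsoft.VisualBasic.FileIO;

public static List<String[]> SplitFile(String path, Encoding encoding, Char delimiter)
{
    var rows = new List<String[]>();
    using (TextFieldParser parser = new TextFieldParser(path, encoding))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(delimiter.ToString());
        parser.HasFieldsEnclosedInQuotes = true;
        // ReadFields returns one String[] per CSV line, honoring quotes.
        while (!parser.EndOfData)
            rows.Add(parser.ReadFields());
    }
    return rows;
}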
I would suggest using a CSV library to read the file.
For example, you can use LumenWorksCsvReader: https://www.nuget.org/packages/LumenWorksCsvReader
Your approach with regex validation is actually OK.
For example, you could create a "validation dictionary" and check every CSV value against its regex expression.
Then you can build a function that can validate a CSV file with such a "validation dictionary".
See here:
string lsInput = @"SITE_ID,HOUSE,STREET,CITY,STATE,ZIP,APARTMENT
44,545395,PORT ROYAL,CORPUS CHRISTI,TX,78418,2
44,608646,TEXAS AVE,ODESSA,TX,79762,
44,487460,EVERHART RD,CORPUS CHRISTI,TX,78413,
44,275543,EDWARD GARY,SAN MARCOS,TX,78666,4
44,136811,MAGNOLIA AVE,SAN ANTONIO,TX,78212";

Dictionary<string, string> loValidations = new Dictionary<string, string>();
loValidations.Add("SITE_ID", @"^\d+$"); // it can only be an integer and it is required.
//....

bool lbValid = true;
using (CsvReader loCsvReader = new CsvReader(new StringReader(lsInput), true, ','))
{
    while (loCsvReader.ReadNextRecord())
    {
        foreach (var loValidationEntry in loValidations)
        {
            if (!Regex.IsMatch(loCsvReader[loValidationEntry.Key], loValidationEntry.Value))
            {
                lbValid = false;
                break;
            }
        }
        if (!lbValid)
            break;
    }
}
Console.WriteLine($"Valid: {lbValid}");
Here is another way to accomplish your needs using Cinchoo ETL - an open source file helper library.
First, define a POCO class with DataAnnotations validation attributes as below:
public class Site
{
    [Required(ErrorMessage = "SiteID can't be null")]
    public int SiteID { get; set; }

    [Required]
    public int House { get; set; }

    [Required]
    public string Street { get; set; }

    [Required]
    [RegularExpression("^[a-zA-Z][a-zA-Z ]*$")]
    public string City { get; set; }

    [Required(ErrorMessage = "State is required")]
    [RegularExpression("^[A-Z][A-Z]$", ErrorMessage = "Incorrect state.")]
    public string State { get; set; }

    [Required]
    [RegularExpression("^[0-9][0-9]*$")]
    public string Zip { get; set; }

    public int Apartment { get; set; }
}
Then use this class with ChoCSVReader to load and check the validity of the file using the Validate()/IsValid() methods as below:
using (var p = new ChoCSVReader<Site>("*** YOUR CSV FILE PATH ***")
    .WithFirstLineHeader(true)
)
{
    Exception ex;
    Console.WriteLine("IsValid: " + p.IsValid(out ex));
}
Hope it helps.
Disclaimer: I'm the author of this library.

MongoDB C# 2.4 - Find distinct values of nested array

I have documents that look like below:
{
    "_id" : ObjectId("58148f4337b1fc09b8c2de9k"),
    "Price" : 69.99,
    "Attributes" : [
        {
            "Name" : "Color",
            "Value" : "Grey"
        },
        {
            "Name" : "Gender",
            "Value" : "Mens"
        }
    ]
}
I am looking to get a distinct list of Attributes.Name (so if I just had the one document as above, I would get 'Color' and 'Gender' returned).
I was able to easily get what I needed through the mongo shell (db.getCollection('myCollection').distinct('Attributes.Name')), but I'm really struggling with the C# driver (version 2.4). Can someone please help me translate the shell command to C#?
I tried something like below (and many variations). I'm new to the Mongo C# driver and am just feeling a bit lost. Any help would be appreciated.
var database = client.GetDatabase("mymongodb");
IMongoCollection<BsonDocument> collection = database.GetCollection<BsonDocument>("mycollection");
var filter = new BsonDocument();
var distinctAttributeNames = collection.Distinct<BsonDocument>("Attributes.Name", filter);
var tryAgain = collection.Distinct<BsonDocument>("{Attributes.Name}", filter);
There you go:
public class Foo
{
    public ObjectId Id;
    public double Price = 69.99;
    public Attribute[] Attributes = {
        new Attribute { Name = "Color", Value = "Grey" },
        new Attribute { Name = "Gender", Value = "Men" }
    };
}

public class Attribute
{
    public string Name;
    public string Value;
}

public class Program
{
    static void Main(string[] args)
    {
        MongoClient client = new MongoClient();
        var collection = client.GetDatabase("test").GetCollection<Foo>("test");
        collection.InsertOne(new Foo());
        var distinctItems = collection.Distinct(new StringFieldDefinition<Foo, string>("Attributes.Name"), FilterDefinition<Foo>.Empty).ToList();
        foreach (var distinctItem in distinctItems)
        {
            Console.WriteLine(distinctItem);
            // prints:
            // Color
            // Gender
        }
        Console.ReadLine();
    }
}
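If you'd rather stay with BsonDocument (as in the question) instead of mapping a Foo class, the same Distinct call should work by asking for the values as strings; a sketch using the question's own database and collection names:

var client = new MongoClient();
var database = client.GetDatabase("mymongodb");
var collection = database.GetCollection<BsonDocument>("mycollection");
// The field-name string converts implicitly to a FieldDefinition<BsonDocument, string>,
// so no mapped class is needed.
List<string> distinctNames = collection
    .Distinct<string>("Attributes.Name", FilterDefinition<BsonDocument>.Empty)
    .ToList();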

Creating Relationships Between Nodes in Neo4j with Neo4jClient in C#

I'm working with Neo4j using the .NET Neo4jClient (http://hg.readify.net/neo4jclient/wiki/Home). In my code, nodes are airports and relationships are flights.
If I want to create nodes and relationships at the same time, I can do it with the following code:
Classes
public class Airport
{
    public string iata { get; set; }
    public string name { get; set; }
}

public class flys_toRelationship : Relationship, IRelationshipAllowingSourceNode<Airport>, IRelationshipAllowingTargetNode<Airport>
{
    public static readonly string TypeKey = "flys_to";

    // Assign Flight Properties
    public string flightNumber { get; set; }

    public flys_toRelationship(NodeReference targetNode)
        : base(targetNode)
    { }

    public override string RelationshipTypeKey
    {
        get { return TypeKey; }
    }
}
Main
// Create a New Graph Object
var client = new GraphClient(new Uri("http://localhost:7474/db/data"));
client.Connect();
// Create New Nodes
var lax = client.Create(new Airport() { iata = "lax", name = "Los Angeles International Airport" });
var jfk = client.Create(new Airport() { iata = "jfk", name = "John F. Kennedy International Airport" });
var sfo = client.Create(new Airport() { iata = "sfo", name = "San Francisco International Airport" });
// Create New Relationships
client.CreateRelationship(lax, new flys_toRelationship(jfk) { flightNumber = "1" });
client.CreateRelationship(lax, new flys_toRelationship(sfo) { flightNumber = "2" });
client.CreateRelationship(sfo, new flys_toRelationship(jfk) { flightNumber = "3" });
The problem, however, is when I want to add relationships to already existing nodes. Say I have a graph consisting of only two nodes (airports), say SNA and EWR, and I would like to add a relationship (flight) from SNA to EWR. I try the following and it fails:
// Create a New Graph Object
var client = new GraphClient(new Uri("http://localhost:7474/db/data"));
client.Connect();
Node<Airport> departure = client.QueryIndex<Airport>("node_auto_index", IndexFor.Node, "iata:sna").First();
Node<Airport> arrival = client.QueryIndex<Airport>("node_auto_index", IndexFor.Node, "iata:ewr").First();
//Response.Write(departure.Data.iata); <-- this works fine, btw: it prints "sna"
// Create New Relationships
client.CreateRelationship(departure, new flys_toRelationship(arrival) { flightNumber = "4" });
The two errors I'm receiving are as follows:
1) Argument 1: cannot convert from 'Neo4jClient.Node' to 'Neo4jClient.NodeReference'
2) The type arguments for method 'Neo4jClient.GraphClient.CreateRelationship(Neo4jClient.NodeReference, TRelationship)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
The method the error is referring to is in the following class: http://hg.readify.net/neo4jclient/src/2c5446c17a65d6e5accd420a2dff0089799cbe16/Neo4jClient/GraphClient.cs?at=default
Any ideas?
In your CreateRelationship call you will need to use the node references, not the nodes, so:
client.CreateRelationship(departure.Reference, new flys_toRelationship(arrival.Reference) { flightNumber = "4" });
The reason your initial creation code works and this didn't is that Create returns a NodeReference<Airport> (the var is hiding that for you), while QueryIndex returns a Node<Airport> instance instead.
Neo4jClient predominantly uses NodeReferences for the majority of its operations.
The second error was just related to not using the .Reference property: the compiler couldn't determine the type arguments. When you use the .Reference property, that error will go away as well.
