Parsing Set Width Text File - c#

I am looking for a bit of guidance on how to parse CDR data from flat text files received from a PBX for reporting. The files are set width rather than using a delimiting character.
I have found something called TextFieldParser, but I wonder if there is a better, simpler way.
http://csharphelper.com/blog/2017/02/use-a-textfieldparser-to-read-fixed-width-data-c/
I’ve added some examples from the vendor below so you can see the exact format and what the fields mean.
Example

The following is an example of an External SMDR record:
01/14 09:24 00:00:59 T201 003 P001 100 1011T 1405
Where,
01/14 is the date the caller contacted your contact center
09:24 is the time the call originated
00:00:59 is the amount of time the agent spoke with the caller before transferring the call
T201 is the number of the trunk that the caller dialed in to
003 is the time to answer for the agent (not the time spent in queue)
P001 is the reporting number of the ACD path queue the call was queued to
100 is the reporting number of the agent group
1011 is the ID of the agent who first answered the call
T is the transferred call identifier
1405 is the ID of the agent whom the call was transferred to
This means that an outside caller dialed in to the contact center on Trunk 201, on January 14th at 9:24 AM. The call was queued to the ACD Path Queue 1 (shown as P001), queued to Agent Group 100, and answered by Agent 1011 after 3 seconds waiting in queue. The agent who answered the call talked to the customer for 59 seconds before transferring the call to Agent 1405.
 
Internal SMDR records
An Internal SMDR record is generated by the PBX when
1. A call is completed (i.e. when all parties involved in the call have hung up) between two devices on the PBX (extensions or agents), with no outside parties (trunks) involved in the call 
2. The call is an internal answered call only
 3. Calls to ACD queues report based on the dialable number of the queue, not the reporting number as found in the External SMDR records.
4. All parties in the call have their Class of Services set to enable SMDR Internal recording
5. The PBX has the Internal SMDR option enabled.


Example

The following is an example of an Internal SMDR record:
01/14 07:20 00:00:10 6979 002 6515 I 7015
Where,
01/14 is the date the call was made
07:20 is the time the call originated
00:00:10 is the length of the call
6979 is the extension that the call was made from
002 is the time to answer for the agent (not the time spent in queue)
6515 is the dialable number of the ACD queue the call was made to
I is the internal call identifier
7015 is the ID of the agent who answered the call
This means that on January 14th at 7:20 AM, internal Extension 6979 dialed the ACD Queue P001 with dialable number 6515. The call was answered by Agent 7015 after 2 seconds of wait time. The two parties talked for 10 seconds. There was no external caller involved in this call.
I want to be able to parse the CDR/SMDR data above and put it into a database so it can be reported on. I can quite easily do this with CSV data but just need some guidance on the best way to do this with set width data.

For fixed-width parsing you'll want to use String.Substring(startIndex, length); see the String.Substring reference in the MS Docs.
In your example you would do something along the lines of the following (note: check the offsets against the vendor's column layout, but you should get the general picture):
var line = "01/14 09:24 00:00:59 T201 003 P001 100 1011T 1405";
// If we think about the string as an array of characters, then:
// the date occupies indexes 0 through 4, so we start at index 0 and take 5 characters.
// Note that the second argument is a length, not an end index.
var date = line.Substring(0, 5); // This will be "01/14" as a string.
// The time occupies indexes 6 through 10, so we start at index 6 and take 5 characters.
var time = line.Substring(6, 5); // This will be "09:24" as a string.
You would continue in this fashion until you have all the data you'd want from the line.
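To carry the same idea through the whole line, here is a minimal sketch of a parser for the External SMDR record. The class, field and method names are made up for illustration, and the column offsets are assumed from the single sample line in the question; the real column positions should be taken from the vendor's documentation, since spacing in a pasted sample may not match the file.

```csharp
using System;
using System.Globalization;

// One parsed External SMDR record. Names are illustrative, not vendor names.
public sealed class ExternalSmdrRecord
{
    public string Date = "";            // "MM/dd"; the record carries no year
    public string StartTime = "";       // "HH:mm"
    public TimeSpan Duration;
    public string Trunk = "";
    public int SecondsToAnswer;
    public string PathQueue = "";
    public string AgentGroup = "";
    public string AnsweringAgent = "";
    public bool Transferred;
    public string TransferTarget = "";
}

public static class SmdrParser
{
    // Returns the trimmed field, or "" when the line is shorter than expected.
    private static string Field(string line, int start, int length)
    {
        if (start >= line.Length) return "";
        return line.Substring(start, Math.Min(length, line.Length - start)).Trim();
    }

    // Offsets are ASSUMED from the sample line; verify them against the vendor layout.
    public static ExternalSmdrRecord ParseExternal(string line)
    {
        return new ExternalSmdrRecord
        {
            Date = Field(line, 0, 5),
            StartTime = Field(line, 6, 5),
            Duration = TimeSpan.ParseExact(Field(line, 12, 8), @"hh\:mm\:ss", CultureInfo.InvariantCulture),
            Trunk = Field(line, 21, 4),
            SecondsToAnswer = int.Parse(Field(line, 26, 3), CultureInfo.InvariantCulture),
            PathQueue = Field(line, 30, 4),
            AgentGroup = Field(line, 35, 3),
            AnsweringAgent = Field(line, 39, 4),
            Transferred = Field(line, 43, 1) == "T",
            TransferTarget = Field(line, 45, 4),
        };
    }
}
```

Each line read from the file (for example with File.ReadLines) can be passed to SmdrParser.ParseExternal and the resulting object written to the database. An Internal SMDR record would need its own set of offsets.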

Related

EWS exchange rooms view

I need to build an application that shows the current Exchange meeting rooms and, for each room, whether each hour is free or busy. The user can give a date range of at most 5 days to see the result.
I have built something, but it is too slow to use: it takes up to 3 seconds to get all the information for only 3 meeting rooms (and in the future there will be around 20).
This is how I work:
Authenticate through the AutodiscoverUrl function: service.AutodiscoverUrl(email, password).
After being given a start date and end date spanning 5 days, I first get all the available meeting rooms with service.GetRooms("room#roomlist.com").
I iterate through the meeting rooms found and use the function service.GetUserAvailability(room,...) to get the calendar events.
Then I have a class which tells me the hours of the day, and I check the calendar events found for the room to see whether an hour is busy or not.
Now I have my collection of rooms with calendar events and the indication whether an hour is busy or not.
But is there another, faster way? As said, this takes 2 to 3 seconds for only 3 rooms in a date range of 5 days.
Are you calling the GetUserAvailability request for each room, as you iterate through, or batching the users together? The availability call can return info on multiple users (100 is the hard limit I recall). It's likely one big call will be more efficient than multiple single calls.

Generate closest teams based on employee schedules C#

I am given a csv of employee schedules with columns:
employee ID, first last name, sunday schedule, monday schedule, ... , saturday schedule
1 week schedule for each employee. I've attached a screenshot of a portion of the csv file. The total file has around 300 rows.
I need to generate teams of 15 based on the employees' schedules (locations don't matter) so that the employees on each team have the closest schedules to each other. Pseudocode of what I have tried:
parse csv file into array of schedules (my own struct definition)
match employees who have the same exact schedule into teams (creates ~5 full sized teams, 20 - 25 half filled teams, leaves ~50 schedules who don't match with anyone)
for i = 1 to 14, for each member of teams of size i, find the team with the closest schedule (as a whole) and add the member to that team. Once a team reaches size 15, mark them as "done".
This worked somewhat but definitely did not give me the best teams. My question is does anyone know a better way to do this? Pseudocode or just a general idea will help, thanks.
EDIT: Here is an example of the formula of comparison
The comparison is based on half-hour blocks of difference between the agents' schedules. Agent 25 has a score of 16 because he has a difference of 8 half hours with each of Agent 23 and Agent 24. The team's total score is 32, based on everyone's scores added together.
Not all agents work 8 hour days, and many have different days off, which have the greatest effect on their "closeness" score. Also, a few agents have a different schedule on a certain day than their normal schedule. For example, one agent might work 7am - 3pm on mondays but work 8am - 4pm on tuesday - friday.
Unless you find a method that gets you an exact best answer, I would add a hill-climbing phase at the end that repeatedly checks to see if swapping any pair of agents between teams would improve things, and swaps them if this is the case, only stopping when it has rechecked every pair of agents and there are no more improvements to be made.
I would do this for two reasons:
1) Such hill-climbing finds reasonably good solutions surprisingly often.
2) People are good at finding improvements like this. If you produce a computer-generated schedule and people can find simple improvements (perhaps because they notice they are often scheduled at the same time as somebody from another team) then you're going to look silly.
Thinking about (2) another way to find local improvements would be to look for cases where a small number of people from different teams are scheduled at the same time and see if you can swap them all onto the same team.
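The pairwise-swap hill climbing described above can be sketched roughly as follows. This is only an illustration under some assumptions: each schedule is modeled as a bool array with one entry per half-hour block of the week, the distance is the number of blocks where the two schedules differ, and a team's cost is the sum of distances over all pairs of members (half of the "everyone's scores added together" total from the question, so it ranks teams the same way). All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TeamHillClimber
{
    // Number of half-hour blocks in which exactly one of the two agents works.
    public static int Distance(bool[] a, bool[] b)
    {
        int diff = 0;
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) diff++;
        return diff;
    }

    // Cost of one team: sum of distances over all pairs of members.
    public static int TeamCost(List<bool[]> team)
    {
        int cost = 0;
        for (int i = 0; i < team.Count; i++)
            for (int j = i + 1; j < team.Count; j++)
                cost += Distance(team[i], team[j]);
        return cost;
    }

    // Keeps swapping pairs of agents between teams while a swap lowers the cost.
    // Stops after a full pass over every pair finds no improvement.
    public static void Improve(List<List<bool[]>> teams)
    {
        bool improved = true;
        while (improved)
        {
            improved = false;
            for (int t1 = 0; t1 < teams.Count; t1++)
            for (int t2 = t1 + 1; t2 < teams.Count; t2++)
            for (int i = 0; i < teams[t1].Count; i++)
            for (int j = 0; j < teams[t2].Count; j++)
            {
                int before = TeamCost(teams[t1]) + TeamCost(teams[t2]);
                Swap(teams, t1, i, t2, j);
                int after = TeamCost(teams[t1]) + TeamCost(teams[t2]);
                if (after < before) improved = true;   // keep the swap
                else Swap(teams, t1, i, t2, j);         // undo it
            }
        }
    }

    private static void Swap(List<List<bool[]>> teams, int t1, int i, int t2, int j)
    {
        bool[] tmp = teams[t1][i];
        teams[t1][i] = teams[t2][j];
        teams[t2][j] = tmp;
    }
}
```

Because a swap is kept only when it strictly lowers an integer cost that cannot go below zero, the loop always terminates. Swapping one agent for one agent also keeps every team at its original size.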
Can't say for sure about the schedules, but in string algorithms you can find an edit distance calculation. The idea is to count the operations you need to perform to turn one string into another. For example, the distance between kitten and sitting is 3: 2 substitutions and 1 insertion. I think that you can define a metric between two employees' schedules in a similar way.
Now, once you have a distance function, you can start clustering. The k-means algorithm may be a good start for you, but its main disadvantage is that the number of groups is fixed initially. I think that you can easily adjust the general logic for your needs, though. After that, you may try some additional ways to cluster your data, but you really should start with your distance function, and then simply optimize it on your employee records.
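For reference, the edit distance mentioned above (Levenshtein distance) can be computed with a small dynamic-programming table. This sketch works on strings; the class and method names are illustrative, and how a schedule would be encoded as a sequence is left open here.

```csharp
using System;

public static class EditDistance
{
    // d[i, j] = minimum number of insertions, deletions and substitutions
    // needed to turn the first i characters of s into the first j characters of t.
    public static int Levenshtein(string s, string t)
    {
        var d = new int[s.Length + 1, t.Length + 1];
        for (int i = 0; i <= s.Length; i++) d[i, 0] = i;   // delete everything
        for (int j = 0; j <= t.Length; j++) d[0, j] = j;   // insert everything
        for (int i = 1; i <= s.Length; i++)
        {
            for (int j = 1; j <= t.Length; j++)
            {
                int substitution = s[i - 1] == t[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + substitution);
            }
        }
        return d[s.Length, t.Length];
    }
}
```

EditDistance.Levenshtein("kitten", "sitting") returns 3, matching the example above.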

Getting office365 task properties

I am working on Office 365 tasks using the EWS API in C#. I have successfully accessed most properties of a task, but I am stuck on two properties and their values: (1) Repetition, and (2) the unit of measure for ActualWork and TotalWork. I can access ActualWork and TotalWork but not their units of measure, e.g. is ActualWork set in hours or minutes? Can anyone help me get these properties?
The unit of measurement is always minutes, as documented in https://msdn.microsoft.com/en-us/library/ee219615(v=exchg.80).aspx . The UI elements are calculated from the value in minutes. You can easily test this in the UI: set the value to 480 minutes and you should see the UI change to 1 day once you save the task and reopen it. Note that the conversion uses working hours, so 480 minutes equal one day and 2400 minutes equal one week (8 hours in a work day and 5 work days in a work week).
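As a small illustration of that arithmetic, the stored minutes can be converted to working days or weeks in your own code. The helper below is hypothetical (it is not part of EWS) and simply applies the 8-hour day and 5-day week stated above.

```csharp
using System;

public static class WorkDuration
{
    public const int MinutesPerWorkDay = 8 * 60;                   // 480
    public const int MinutesPerWorkWeek = 5 * MinutesPerWorkDay;   // 2400

    // Converts a minute count (as stored on the task) to hours.
    public static double ToHours(int minutes) => minutes / 60.0;

    // Converts a minute count to 8-hour working days.
    public static double ToWorkDays(int minutes) => minutes / (double)MinutesPerWorkDay;

    // Converts a minute count to 5-day working weeks.
    public static double ToWorkWeeks(int minutes) => minutes / (double)MinutesPerWorkWeek;
}
```

For example, WorkDuration.ToWorkDays(480) gives 1 and WorkDuration.ToWorkWeeks(2400) gives 1.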

Approach to solve delimited scheduling

I'm facing a problem and I'm having trouble deciding on an approach to solve it. The problem is the following:
Given N phone calls to be made, schedule them so that as many as possible are made.
Known info:
Number of phone calls pending
Number of callers (people who will talk on the phone)
Type of phone call (reminder, billing, negotiation, etc.)
Estimated duration of each phone call type (reminder: 1 min, billing: 3 min, negotiation: 15 min, etc.)
Ideal date for a given call
"Minimum" date of a given call (can't happen before...)
"Maximum" date of a given call (can't happen after...)
A day has only 8 hours
Rules:
Phone calls cannot be made before the "Minimum" date or after the "Maximum" date
A reminder call placed awards 1 point; a reminder call missed costs 2 points
A billing call placed awards 6 points; a billing call missed costs 9 points
A negotiation call placed awards 20 points; a negotiation call missed costs 25 points
A phone call to John should be placed by the first person who ever called him. It does not HAVE TO be, but the call earns extra points if it is.
I know a little about A.I. and I can recognize this as a problem that fits the class, but I just don't know which approach to take. Should I use neural networks? Graph search?
PS: This is not an academic question. This is a real-world problem that I'm facing.
PS2: The point system is still being created; the points sampled here are not the real ones.
PS3: The resulting algorithm can be executed several times (batch-job style), or it can be run online, depending on the performance.
PS4: My contract states that I will charge the client based on (number of calls I place) + (ratio * the duration of the call), but there's a clause about quality of service, and placing only reminder calls is not good for me, because even when reminded, people still forget to attend their appointments, which reduces the "quality" of the service I provide. I don't know the exact numbers yet.
This does not seem like a problem for AI.
If it were me I would create a set of rules, ordered by priority. Then start filling in the caller's schedule.
Maybe one of the rules is to assign the shortest-duration call types first (to satisfy the "maximum number of calls made" criterion).
This is sounding more and more like a knapsack problem, where you would substitute in call duration and call points for weight and price.
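A minimal sketch of that knapsack framing, for one caller and one day: call durations act as weights, call values as prices, and the day's available minutes as the capacity. It deliberately ignores the minimum/maximum dates, multiple callers and the "same caller" bonus. The names are illustrative, and treating a call's value as its award plus the avoided miss penalty (for example 1 + 2 = 3 for a reminder) is an assumption, not something stated in the question.

```csharp
using System;

public static class CallKnapsack
{
    // Classic 0/1 knapsack. best[c] is the highest total value of any set of
    // calls whose durations add up to at most c minutes. Durations must be > 0.
    public static int MaxPoints(int[] durations, int[] values, int capacityMinutes)
    {
        var best = new int[capacityMinutes + 1];
        for (int i = 0; i < durations.Length; i++)
        {
            // Walk the capacity downwards so each call is used at most once.
            for (int c = capacityMinutes; c >= durations[i]; c--)
            {
                best[c] = Math.Max(best[c], best[c - durations[i]] + values[i]);
            }
        }
        return best[capacityMinutes];
    }
}
```

With one reminder, one billing and one negotiation call (durations 1, 3 and 15 minutes, assumed values 3, 15 and 45) and only 18 minutes available, MaxPoints returns 60, which corresponds to placing the billing and negotiation calls and skipping the reminder.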
This is just a very basic answer, but you could try to "brute force" an optimum solution:
Use the Combinatorics library (it's in NuGet too) to generate every permutation of calls for a given person to make in a given time period (looking one week into the future, for instance).
For each permutation, group the calls into 8-hour chunks by estimated duration, and assign a date to them.
Iterate through the chunks - if you get to a call too early, discard that permutation. Otherwise add or subtract points based on whether the call was made before the end date. Store the total score as the score for that permutation.
Choose the permutation with the highest score.

How to count the data sent or received by my PC (all processes/programs)?

I need to count the amount (in B/kB/MB/whatever) of data sent and received by my PC, by every running program/process.
Let's say I click "Start counting" and I get the sum of everything sent/received by my browser, FTP client, system updates, etc. from that moment until I choose "Stop".
To make it simpler, I want to count data transferred via TCP only - if it matters.
For now, I got the combo list of NICs in the PC (based on the comment in the link below).
I tried to change the code given here but I failed, getting strange out-of-nowhere values in dataSent/dataReceived.
I also read the answer at the question 442409 but as I can see it is about the data sent/received by the same program, which doesn't fit my requirements.
Perfmon should have counters for this type of thing that you want to do, so look there first.
Alright, I think I've found a solution, but maybe someone will suggest something better...
I made a timer (tested with a 10 ms interval) which reads the "Bytes Received/sec" PerformanceCounter value, adds it to a global "temporary" sum and increments a sample counter (in case there is any lag). Then I made a second timer with a 1 s interval, which takes the temporary sum, divides it by the sample counter and adds the result to the overall amount (also global). Then it resets the temporary sum and the counter.
I'm just not sure if this is the right method, because I don't know how the values of the "Bytes Received/sec" PerformanceCounter vary during one second. Maybe I should build some kind of histogram and take the average value?
For now, downloading an 8.6 MB file gave me an overall amount of 9.2 MB. Is it possible that other processes would generate that amount of network activity in less than 20 seconds?
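A different technique from the sampling described above, offered only as a sketch: instead of averaging a per-second rate, read the cumulative byte counters of the network interfaces when counting starts and again when it stops, and subtract. The example uses NetworkInterface.GetIPStatistics() from System.Net.NetworkInformation; the class and method names are made up. Note that these counters cover all IP traffic on the interface (not TCP only, and including protocol headers), so the total will be somewhat higher than the size of the files transferred.

```csharp
using System;
using System.Net.NetworkInformation;

public static class TrafficCounter
{
    // Sums the cumulative byte counters of every active, non-loopback interface.
    public static (long Received, long Sent) Snapshot()
    {
        long received = 0, sent = 0;
        foreach (NetworkInterface nic in NetworkInterface.GetAllNetworkInterfaces())
        {
            if (nic.OperationalStatus != OperationalStatus.Up) continue;
            if (nic.NetworkInterfaceType == NetworkInterfaceType.Loopback) continue;
            IPInterfaceStatistics stats = nic.GetIPStatistics();
            received += stats.BytesReceived;
            sent += stats.BytesSent;
        }
        return (received, sent);
    }

    // Bytes transferred between two snapshots.
    public static (long Received, long Sent) Delta(
        (long Received, long Sent) start, (long Received, long Sent) stop)
    {
        return (stop.Received - start.Received, stop.Sent - start.Sent);
    }
}
```

The idea is to call Snapshot() when "Start counting" is clicked and again on "Stop", then pass both results to Delta. No timers are needed, so nothing is lost or double-counted between samples.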
