I searched and could not find a good solution for replacing all environment variable references in a file on Linux. So below is my version (simple, reasonably fast). Enjoy.
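#!/bin/bash
# Expands $VAR / ${VAR} references in the file given as the first argument.
# Each input line is turned into a printf command, and the generated commands are sourced.
# Use it only on trusted files: backticks, $(...) and double quotes in the input are evaluated by the shell.
infile="$1"
now=$(date +"%s")
tmpFile="$infile.$now.tmp"
# Prepare the command file used to replace the environment variables
: > "$tmpFile"
# An empty IFS and -r keep leading whitespace and backslashes intact
while IFS= read -r line
do
    # '%s\n' prints the expanded line as data, so a % in the input is harmless
    echo "printf '%s\n' \"$line\"" >> "$tmpFile"
done < "$infile"
source "$tmpFile"
rm "$tmpFile"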
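Because the script above evaluates every line of the input as shell code, it is only suitable for trusted files. As a rough alternative, here is a minimal Java sketch of the same substitution done with a regular expression instead of sourcing generated commands. It is a different technique from the script, it assumes that only the $NAME and ${NAME} forms need to be handled, and the class and method names (EnvExpander, expand) are made up for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original script.
class EnvExpander {
    // Matches ${NAME} (group 1) or $NAME (group 2).
    private static final Pattern VAR =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}|\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Replaces every variable reference in text with its value from env.
    // Names missing from env expand to an empty string, as unset variables do in the shell.
    static String expand(String text, Map<String, String> env) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            String value = env.getOrDefault(name, "");
            // quoteReplacement keeps '$' and '\' inside the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Expands against the real process environment.
        System.out.println(expand("home is $HOME, user is ${USER}", System.getenv()));
    }
}
```

The text is never executed here, so quotes, backticks and percent signs in the input pass through unchanged.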
Wednesday, January 28, 2015
Tuesday, December 2, 2014
The traditional reservoir sampling algorithm guarantees that each of the n numbers seen so far is picked with probability 1/n. In practice, though, the returned sequence shows very little variation, because the cached number changes only when the random draw matches the count. Given the constraint that only one number may be cached, I added a line that toggles the cached value, which raises the rate at which the cached value changes. This logic runs after the return value has been decided. It is a separate question: I got a new value; do I keep the cached one, or swap the new one in?
Run it and compare the results.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ReservoirSampling {
    public static void main(String[] args) {
        List<Integer> lst = new ArrayList<>();
        List<Integer> lstToggled = new ArrayList<>();
        ReservoirSampling r = new ReservoirSampling();
        ReservoirSampling rt = new ReservoirSampling();
        for (int i = 1; i <= 100; i++) {
            lst.add(r.run(i));
            lstToggled.add(rt.runToggled(i));
        }
        System.out.println(lst);
        System.out.println(lstToggled);
    }

    public ReservoirSampling() {
        _rdm = new Random(System.currentTimeMillis());
    }

    // Traditional version: the cached number changes only when the draw matches the count.
    public int run(int number) {
        _count++;
        if (_count == 1) {
            _lastNumber = number;
            return _lastNumber;
        }
        // nextInt(_count) is uniform over 0.._count-1, so the branch below fires with probability 1/_count
        int n = _rdm.nextInt(_count);
        if (n == _count - 1) {
            _lastNumber = number;
        }
        return _lastNumber;
    }

    // Toggled version: same return decision, then optionally swap the new number into the cache.
    public int runToggled(int number) {
        _count++;
        if (_count == 1) {
            _lastNumber = number;
            return number;
        }
        int n = _rdm.nextInt(_count);
        if (n == _count - 1) {
            _lastNumber = number;
            return number;
        }
        int result = _lastNumber;
        if (n % 2 == 0) { // toggled: cache the new number for later calls
            _lastNumber = number;
        }
        return result;
    }

    private int _count;
    private int _lastNumber; // previous random number cached
    private Random _rdm;
}
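The 1/n property mentioned above can be checked empirically. The following is a small sketch, separate from the class above, that assumes the standard rule in which the i-th number replaces the kept one with probability 1/i. The class name, method names and the fixed seed are hypothetical choices for illustration.

```java
import java.util.Random;

// Hypothetical helper, not part of the original post: estimates how often
// single-slot reservoir sampling ends up holding each number of a stream.
class ReservoirCheck {
    // Runs one pass over the stream 1..n and returns the number left in the slot.
    static int sampleOne(int n, Random rdm) {
        int kept = 0;
        for (int i = 1; i <= n; i++) {
            // The i-th number replaces the kept one with probability 1/i
            // (nextInt(1) is always 0, so the first number is always kept).
            if (rdm.nextInt(i) == 0) {
                kept = i;
            }
        }
        return kept;
    }

    // Repeats the pass `trials` times and counts how often each number was kept.
    static int[] histogram(int n, int trials, long seed) {
        Random rdm = new Random(seed);
        int[] counts = new int[n + 1]; // index 0 is unused
        for (int t = 0; t < trials; t++) {
            counts[sampleOne(n, rdm)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 10;
        int trials = 100_000;
        int[] counts = histogram(n, trials, 42L);
        for (int i = 1; i <= n; i++) {
            System.out.printf("number %d kept in %.3f of trials%n", i, counts[i] / (double) trials);
        }
    }
}
```

With n = 10, every printed fraction should land close to 0.1 if the selection is uniform.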
Friday, March 26, 2010
.NET: How much slower is calling PropertyInfo.SetValue?
Link: internals from CLR team about this.
This question has been on my mind for a while, so I decided to test it.
The test is simple: set a property value directly versus using PropertyInfo to set the value. To make things more interesting, I also tested setting values in a Dictionary.
The result: PropertyInfo is 232 times (27166 / 117) slower than direct access!
In 1 million test runs:
- direct access: 117 milliseconds
- PropertyInfo: 27166 milliseconds
- Dictionary: 266 milliseconds
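using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;
using Microsoft.VisualStudio.TestTools.UnitTesting;

class Employee
{
    private string _firstName;
    private string _lastName;
    public string FirstName
    {
        get { return _firstName; }
        set { _firstName = value; }
    }
    public string LastName
    {
        get { return _lastName; }
        set { _lastName = value; }
    }
}

// The method below belongs in a [TestClass]. The loop bodies, the stopwatch resets and the
// Dictionary type arguments were garbled in the original post and are reconstructed from
// the surviving fragments, so treat them as an approximation.
[TestMethod]
public void PropertyPerformanceTest()
{
    int maxLoop = 1000000;
    long durationDirect = 0;
    long durationProperty = 0;
    long durationDictionary = 0;
    Stopwatch sw = new Stopwatch();
    // 1. Set the properties directly
    sw.Start();
    for (int i = 0; i < maxLoop; i++)
    {
        Employee e = new Employee();
        e.FirstName = "Joan";
        e.LastName = "Smith";
    }
    durationDirect = sw.ElapsedMilliseconds;
    // 2. Set the properties through PropertyInfo
    PropertyInfo piLastName = typeof(Employee).GetProperty("LastName");
    PropertyInfo piFirstName = typeof(Employee).GetProperty("FirstName");
    sw.Reset();
    sw.Start();
    for (int i = 0; i < maxLoop; i++)
    {
        Employee e = new Employee();
        piFirstName.SetValue(e, "Joan", null);
        piLastName.SetValue(e, "Smith", null);
    }
    durationProperty = sw.ElapsedMilliseconds;
    // 3. Set the values in a Dictionary
    Dictionary<string, string> employeeValue = new Dictionary<string, string>();
    employeeValue.Add("LastName", null);
    employeeValue.Add("FirstName", null);
    sw.Reset();
    sw.Start();
    for (int i = 0; i < maxLoop; i++)
    {
        employeeValue["FirstName"] = "Joan";
        employeeValue["LastName"] = "Smith";
    }
    durationDictionary = sw.ElapsedMilliseconds;
    Assert.IsTrue(durationProperty > durationDictionary);
    Assert.IsTrue(durationDictionary > durationDirect);
}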
Monday, March 1, 2010
Software Engineer Productivity
Is this a myth?
Enough said.
Wouldn't it be easier to treat software engineers like building bricks and add, remove, or replace them as needed? Whenever you want to grow, buy more; whenever you face financial pressure, remove some?
Unfortunately, that's not the way the system works.
It still takes considerable effort and experience to learn which parts of your internal system are core assets and which parts depend less on domain knowledge and engineering insight. The latter could be outsourced much more easily, but the core assets cannot.
The other myth: as far as I have seen, software engineering is still an art, more or less. Yes, anyone can work as a carpenter after some training. But does every carpenter produce the same quality of work in a similar amount of time?
'Reuse' Is Not Usable
When it comes to code sharing, it is often either not done at all, done only minimally, or done heavily but in such a complicated way that the code is shared once and then never used by anyone.
Is there a balance? I have always struggled between sharing code and keeping code simple. What principles or guidelines should we follow?
When to share? When not to? And what is not sharable?
Below is the list I came up with:
1. Interface sharing. If the interface can be defined clearly enough that any developer who comes across it understands it right away, share it. If the shared interface has grown layer upon layer, function upon function, specialized here and there for a particular implementation, don't share it.
2. Function sharing. Surprisingly, this is the best sharing technique. It provides two main benefits: a. Scalability. b. Focused logic (any function should fit within one page).
... to be filled
Rules of thumb
- Copying and pasting code is never a good idea.
- If you end up spending more time and tremendous effort to reuse something simple, you have probably over-shared the code.
- Refer to the rules above and try to write reusable code as much as possible.