Intelligence involves several crucial characteristics: understanding the physical world, remembering and retrieving information, reasoning, and planning. These are essential traits of intelligent entities, whether human or animal. Unfortunately, LLMs can only simulate these abilities in a rudimentary way.

LLMs are trained on massive datasets, drawing from an enormous corpus of text of approximately 10^13 tokens. At roughly 2 bytes per token, that amounts to around 2 × 10^13 bytes, or 20 terabytes, of training data. To put that in perspective, it would take a person reading 8 hours a day about 170,000 years to get through all of it.

But when you consider the data processed by a human, it’s clear that this isn’t as impressive as it sounds. A developmental psychologist once told me that a 4-year-old child, who has been awake for roughly 16,000 hours, has likely processed around 10^15 bytes of visual information. This estimate comes from the fact that the optic nerve transmits approximately 20 megabytes per second.

In short, a 4-year-old processes 10^15 bytes of visual data, while an LLM is trained on just 2*10^13 bytes of text—data that would take 170,000 years to read. The key difference is that sensory input provides us with vastly more information than language alone. Most of our knowledge is acquired through observation, not just language.
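The comparison above is back-of-the-envelope arithmetic, and it can be checked in a few lines. This sketch uses only the estimates quoted in the text (2 bytes per token, 16,000 waking hours, roughly 20 MB/s through the optic nerve):

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
tokens = 1e13                # LLM training corpus size, in tokens
bytes_per_token = 2
llm_bytes = tokens * bytes_per_token                 # ~2e13 bytes, i.e. ~20 TB

hours_awake = 16_000         # waking hours of a 4-year-old
optic_nerve_Bps = 2e7        # ~20 megabytes per second through the optic nerve
child_bytes = hours_awake * 3600 * optic_nerve_Bps   # ~1.15e15 bytes

print(f"LLM training data: {llm_bytes:.1e} bytes")
print(f"4-year-old's visual input: {child_bytes:.1e} bytes")
print(f"The child has taken in roughly {child_bytes / llm_bytes:.0f}x more data")
```

By these estimates, the child's visual input exceeds the LLM's entire training corpus by more than a factor of fifty.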

The environment around us is far richer and more complex than anything that can be fully captured in words. Language, in many ways, is an approximate representation of our perceptions and mental models.

Consider this: we have LLMs that can pass the bar exam, a remarkable achievement by any measure. Yet, they can’t learn to drive in 20 hours like a typical 17-year-old. They can’t clear the dinner table and load a dishwasher after one demonstration, as a 10-year-old can. So, what are we missing? What kind of learning or reasoning architecture is preventing us from developing truly intelligent systems, like fully autonomous vehicles or domestic robots?

Here’s a look at the population and murder rates for some US states:

State | Population | Murder Rate (per 100,000) |
---|---|---|
Alabama | 4,779,736 | 5.7 |
Alaska | 710,231 | 5.6 |
Arizona | 6,392,017 | 4.7 |
Arkansas | 2,915,918 | 5.6 |
California | 37,253,956 | 4.4 |
Colorado | 5,029,196 | 2.8 |
Connecticut | 3,574,097 | 2.4 |
Delaware | 897,934 | 5.8 |

Using R’s built-in functions, we can compute estimates of variability for the state population data:

```
> sd(state[["Population"]])
[1] 6848235
> IQR(state[["Population"]])
[1] 4847308
> mad(state[["Population"]])
[1] 3849870
```

The standard deviation is almost twice as large as the MAD (in R, by default, the MAD is scaled by a factor of 1.4826 so that, for normally distributed data, it is on the same scale as the standard deviation). This is not surprising, since the standard deviation is sensitive to outliers, and the state populations include a few very large values.
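For readers working in Python rather than R, the same three estimates can be computed with NumPy alone. The data below is a synthetic stand-in for the state population column (the real dataset isn't bundled here), built to include a few outsized, California-like values:

```python
import numpy as np

# Synthetic stand-in for the state population column: mostly mid-sized values
# plus a few very large outliers.
rng = np.random.default_rng(42)
pop = np.concatenate([rng.normal(6e6, 1.5e6, 45), [37e6, 25e6, 19e6]])

sd = np.std(pop, ddof=1)                                # analogue of R's sd()
iqr = np.percentile(pop, 75) - np.percentile(pop, 25)   # analogue of R's IQR()
mad = 1.4826 * np.median(np.abs(pop - np.median(pop)))  # R's mad(), default scaling

print(f"sd  = {sd:,.0f}")
print(f"IQR = {iqr:,.0f}")
print(f"MAD = {mad:,.0f}")
```

As with the real data, the outlier-sensitive standard deviation comes out far larger than the robust IQR and MAD estimates.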

The reviews of restaurants, hotels, cafes, and so on that you read on social media sites like Yelp are prone to bias because the people submitting them are not randomly selected; rather, they themselves have taken the initiative to write. This leads to self-selection bias — the people motivated to write reviews may be those who had poor experiences, may have an association with the establishment, or may simply be a different type of person from those who do not write reviews.

Note: While self-selection samples can be unreliable indicators of the true state of affairs, they may be more reliable when simply comparing one establishment to a similar one; the same self-selection bias might apply to each.

In the era of big data, it is sometimes surprising that smaller is better. Time and effort spent on random sampling not only reduce bias, but also allow greater attention to data exploration and data quality. For example, missing data and outliers may contain useful information. It might be prohibitively expensive to track down missing values or evaluate outliers in millions of records, but doing so in a sample of several thousand records may be feasible. Data plotting and manual inspection bog down if there is too much data.

So when are massive amounts of data needed? The classic scenario for the value of big data is when the data is not only big but sparse as well. Consider search queries: any single rare query appears in only a tiny fraction of records, so enormous volumes of data are needed before useful counts for it accumulate.

Typically, a sample is drawn to measure something (with a sample statistic) or to model something (with a statistical or machine learning model). Because our estimate or model is derived from a sample, it might be in error; a different sample might yield a different result. We are therefore concerned with sampling variability. If we had lots of data, we could draw additional samples and observe the distribution of a sample statistic directly. In practice, though, we compute our estimate or model from as much data as is available, so drawing additional samples from the population is often not feasible.

Note: It is crucial to distinguish between the data distribution (distribution of individual data points) and the sampling distribution (distribution of a sample statistic).

The standard error succinctly summarizes the variability within the sampling distribution of a statistic. It can be estimated using the sample’s standard deviation (s) and the sample size (n). As the sample size increases, the standard error decreases, following the square-root of (n) rule: reducing the standard error by a factor of 2 requires increasing the sample size by a factor of 4.
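A quick simulation illustrates the square-root rule. This is a sketch with a synthetic skewed population; the exact numbers depend on the random seed:

```python
import numpy as np

rng = np.random.default_rng(0)

def standard_error(sample):
    # SE estimated from a single sample: s / sqrt(n)
    return np.std(sample, ddof=1) / np.sqrt(len(sample))

# A skewed synthetic population to draw from.
population = rng.exponential(scale=1.0, size=1_000_000)

se_n = standard_error(rng.choice(population, size=500))     # sample size n
se_4n = standard_error(rng.choice(population, size=2_000))  # sample size 4n

print(f"SE at n=500:  {se_n:.4f}")
print(f"SE at n=2000: {se_4n:.4f}")   # roughly half of the n=500 value
```

Quadrupling the sample size roughly halves the standard error, as the square-root of (n) rule predicts.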

Note: Do not confuse standard deviation (which measures the variability of individual data points) with standard error (which measures the variability of a sample metric).

A straightforward and effective method for estimating the sampling distribution of a statistic or model parameters is to draw additional samples, with replacement, from the sample itself and recalculate the statistic or model for each resample. This method, known as the bootstrap, does not necessarily assume that the data or sample statistic follows a normal distribution.

Warning: The bootstrap does not make up for a small sample size; it neither creates new data nor fills gaps in existing data. It merely shows how additional samples would behave when drawn from a population like the original sample.
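A minimal sketch of the resampling procedure, bootstrapping the median of a skewed synthetic sample with NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.lognormal(mean=10, sigma=1, size=200)  # one observed (skewed) sample

# Bootstrap: resample WITH replacement from the sample itself,
# recomputing the statistic (here, the median) for each resample.
boot_medians = np.array([
    np.median(rng.choice(sample, size=len(sample), replace=True))
    for _ in range(1000)
])

print(f"sample median: {np.median(sample):,.0f}")
print(f"bootstrap SE of the median: {boot_medians.std(ddof=1):,.0f}")
```

Note that nothing here assumes normality of the data or of the median; the spread of the resampled medians is itself the variability estimate.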

Frequency tables, histograms, boxplots, and standard errors are tools to understand potential errors in a sample estimate. Confidence intervals provide another method for this.

Confidence intervals come with a coverage level, expressed as a high percentage, such as 90% or 95%. A 90% confidence interval, for instance, encloses the central 90% of the bootstrap sampling distribution of a sample statistic.

Given a sample of size (n) and a sample statistic of interest, the algorithm for a bootstrap confidence interval is as follows:

- Draw a random sample of size (n) with replacement from the data (a resample).
- Record the statistic of interest for the resample.
- Repeat steps 1–2 multiple (R) times.
- For an (x\%) confidence interval, trim (\left[\frac{100 - x}{2}\right]\%) of the (R) resample results from each end of the distribution.
- The trim points are the endpoints of an (x\%) bootstrap confidence interval.
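The steps above translate almost line for line into code. A sketch assuming NumPy, with the statistic, coverage level, and number of resamples as parameters:

```python
import numpy as np

def bootstrap_ci(data, stat=np.mean, level=90, R=5000, seed=0):
    """Percentile bootstrap confidence interval, following the five steps above."""
    rng = np.random.default_rng(seed)
    n = len(data)
    # Steps 1-3: R resamples of size n, drawn with replacement,
    # recording the statistic for each.
    boot_stats = [stat(rng.choice(data, size=n, replace=True)) for _ in range(R)]
    # Steps 4-5: trim (100 - level)/2 % from each tail;
    # the trim points are the endpoints of the interval.
    tail = (100 - level) / 2
    return np.percentile(boot_stats, [tail, 100 - tail])

# Example on a skewed synthetic sample:
data = np.random.default_rng(7).exponential(scale=10, size=100)
lo, hi = bootstrap_ci(data, level=90)
print(f"90% bootstrap CI for the mean: [{lo:.2f}, {hi:.2f}]")
```

The percentile method shown here is the simplest variant; the interval it returns brackets the central 90% of the bootstrap sampling distribution.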

Note:While the desired question when obtaining a sample result is often “What is the probability that the true value lies within a certain interval?”, a confidence interval does not directly answer this. Instead, it answers a related probability question framed in terms of the sampling procedure and population.

The t-distribution is a normal-like distribution with thicker and longer tails. It is widely used in depicting distributions of sample statistics. Distributions of sample means are typically shaped like a t-distribution, becoming more normal as the sample size increases; each sample size (via its degrees of freedom) corresponds to a different member of the family of t-distributions.

Many processes randomly produce events at a given overall rate—such as visitors arriving at a website or typos per 100 lines of code. From past data, we can estimate the average number of events per unit time or space, but we might also want to understand the variability from one unit to another. The Poisson distribution provides the distribution of events per unit time/space when sampling many such units, useful in queuing problems like determining capacity needs to confidently handle internet traffic over a specific period.

The key parameter in a Poisson distribution is (\lambda) (lambda), representing the mean number of events per specified interval. The variance of a Poisson distribution is also (\lambda).
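The mean-equals-variance property is easy to check by simulation. A sketch with an arbitrary rate of 2 events per interval:

```python
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0  # e.g. an average of 2 events per one-second interval

# Number of events observed in each of 100,000 one-second intervals.
events = rng.poisson(lam=lam, size=100_000)

print(f"mean     = {events.mean():.3f}")   # both should be close to lambda = 2
print(f"variance = {events.var():.3f}")
```

Both summary statistics converge on (\lambda) as the number of sampled intervals grows.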

- **Lambda ((\lambda)):** The rate (per unit of time or space) at which events occur.
- **Poisson distribution:** The frequency distribution of the number of events in sampled units of time or space.
- **Exponential distribution:** The frequency distribution of the time or distance from one event to the next event.
- **Weibull distribution:** A generalized version of the exponential, in which the event rate is allowed to shift over time.

In many applications, the event rate (\lambda) is either known or can be estimated from prior data. However, for rare events like aircraft engine failure, there might be insufficient data for precise estimates. Without data, estimating an event rate is challenging, but assumptions can be made: for instance, if no failures are observed over 20 hours, the failure rate is unlikely to be 1 per hour. Through simulation or probability calculations, hypothetical event rates can be assessed to estimate threshold values. When some data exists but isn’t enough for a reliable rate estimate, a goodness-of-fit test (e.g., Chi-Square Test) can be used to evaluate how well various rates fit the observed data.
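As a sketch of that last idea, here is a goodness-of-fit check of a hypothesized rate against binned event counts, using only NumPy and the standard library. The failure counts are illustrative, not real data, and the statistic is compared against the standard chi-square critical value rather than a computed p-value:

```python
import math
import numpy as np

# Observed: number of failures in each of 50 equal observation windows
# (illustrative numbers, not real failure data).
events = np.array([0] * 30 + [1] * 14 + [2] * 6)

lam = 0.5   # hypothesized rate: 0.5 failures per window
n = len(events)

# Bin the observations as 0, 1, and 2+ events per window.
counts = np.array([(events == 0).sum(), (events == 1).sum(), (events >= 2).sum()])

# Expected counts under Poisson(lam); the last bin absorbs the tail probability.
pmf = np.array([math.exp(-lam) * lam**k / math.factorial(k) for k in (0, 1)])
expected = n * np.append(pmf, 1 - pmf.sum())

chi2 = ((counts - expected) ** 2 / expected).sum()
# Chi-square critical value with 2 degrees of freedom at the 5% level is 5.99:
print(f"chi2 = {chi2:.2f} (below 5.99, so a rate of 0.5 fits the observed counts)")
```

If the statistic exceeded the critical value, the hypothesized rate would be rejected, and other candidate rates could be assessed the same way.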

Are you struggling to secure an appointment with the Landesamt für Einwanderung (Ausländerbehörde) in Berlin for your visa application? You’re not alone. The process can be daunting, with limited slots and high demand making it challenging to book a convenient time.

**This is a short blog for non-developers. I know how you suffer, so I provide this simple guide to ease your burden.**

Fortunately, there’s a solution that I made: the Berlin Visa Appointment Macro. This Selenium-based bot is designed to streamline the appointment booking process, making it easier for anyone to navigate. Whether you’re a tech whiz or a novice, this tool can help you secure your appointment without the stress.

Before diving in, make sure you know whether you’re using Windows or macOS. This information will help guide you through the installation process.

**For Windows Users:**

**Step 1:** Install Visual Studio Code (VSCode)
- Download VSCode for free from the official website.

**Step 2:** Install Git
- Download Git for Windows from the official website.

**Step 3:** Clone and Run the Bot
- Head over to the GitHub repository for the Berlin Visa Appointment Macro.
- Clone the repository to your local machine using the following command:
`git clone https://github.com/ccomkhj/berlin-visa-termin-macro.git`
- Follow the instructions in the README file to set up and run the bot.

**For macOS Users:**

**Step 1:** Install Visual Studio Code (VSCode)
- Download VSCode for free from the official website.

**Step 2:** Install Git
- Git comes pre-installed on macOS. You can verify its installation by opening Terminal and typing `git --version`.
- If Git is not installed, you can download and install it from the official website.
**Step 3:** Clone and Run the Bot
- Head over to the GitHub repository for the Berlin Visa Appointment Macro.
- Clone the repository to your local machine using the following command:
`git clone https://github.com/ccomkhj/berlin-visa-termin-macro.git`
- Follow the instructions in the README file to set up and run the bot.

The Berlin Visa Appointment Macro automates the process of checking for available appointments on the Landesamt für Einwanderung’s booking page. Using Selenium, the bot continuously monitors the site for open slots. When an appointment becomes available, the bot notifies you via a Slack message and an audible alert, ensuring you never miss an opportunity to secure your appointment.
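The core of such a bot, stripped of the Selenium and Slack specifics, is just a poll-check-notify loop. A hedged sketch of that pattern, where the callables are placeholders rather than the macro's actual functions:

```python
import time

def poll_for_slot(check_slots, notify, interval_s=60, max_polls=None):
    """Poll until check_slots() reports an open appointment, then notify once.

    check_slots: callable returning True when a slot is available
                 (placeholder for the Selenium page check)
    notify:      callable taking a message string
                 (placeholder for the Slack alert and audible beep)
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if check_slots():
            notify("Appointment slot available - book now!")
            return True
        polls += 1
        time.sleep(interval_s)
    return False

# Demo with stand-in callables; the real bot drives a browser and posts to Slack.
found = poll_for_slot(check_slots=lambda: True, notify=print, interval_s=0)
```

Separating the checking and notification steps behind callables like this keeps the loop testable without a browser or a Slack workspace.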

Booking a visa appointment in Berlin can be a frustrating experience, with appointments often scarce and in high demand. This bot levels the playing field by providing an open-source solution to automate the process. Whether you’re a seasoned programmer or a complete beginner, you can benefit from the simplicity and efficiency of this tool.

Don’t let the appointment booking process stress you out. Try the Berlin Visa Appointment Macro today and take the hassle out of securing your visa appointment in Berlin!

However, since starting the entrepreneur’s journey and collaborating with and managing teams, my job is no longer to focus deeply on one topic but to look across multiple areas and support the teams. In my first weeks, I didn’t feel great because I couldn’t follow my own principles: limit the aspects under consideration and concentrate on very few points.

I believe all of life’s lessons truly come only after the experience.
Looking back over the last few weeks, I *indirectly* achieved things that will affect the dynamics of the agriculture industry. (Let’s see how it goes!)

I say *indirectly* because I achieved them using less than 10% of my full working time.
Everyone will wonder how. The secret is to build a system that reaches the target.

Choose a (technically and economically) validated target, set up the corresponding (real-life) loss function, and figure out the (real-life) optimizer. Find passionate, capable talent who can handle the task well (ideally significantly better than you, though that’s not always the right choice). Set up metrics to track whether the (business) model is getting closer to the target. Try different methods to reduce the loss. There is no simple SGD or Adam in life, but I could at least learn sub-optimal methods. Life is a combination of sub-optimal steps; no one knows, let alone proves, the optimal life.

Currently, I am an interviewer: I evaluate people and hire them. I have read thousands of CVs and interviewed hundreds of applicants. From the company’s side, finding good talent is just as challenging.

Why do both parties find it so hard to find each other? I believe the world is a sequence of matchings. Sometimes it is lovers: a boy and a girl like each other, and they become a couple. The same applies to an interviewer and an interviewee; they are looking for the right match. It is all about timing and luck. However smart, charming, or nice you are, if what the company is looking for doesn’t match your profile, it won’t go well.

In every rejection letter, I want to pour my whole heart into saying: you are great, but we are simply not a match. Please don’t be disappointed by a rejection letter. You are amazing.

A film on Netflix titled ‘Hunger’ explores this story through the eyes of a young street cook. Since we all believe we are special in some sense, she takes the offer from a scout. I don’t blame anyone for seeking to be special; I somewhat believe we all see ourselves as the hero or heroine of our own story.

I won’t spell out every scenario of the movie, so I recommend watching it if you have two hours and aren’t sure how to kill them. In short, the film follows the journey from street cooking to professional success and fame. After years of striving, the heroine, Aoy, decides to return to her family. It’s a somewhat cliched story if you are looking for something nouveau, but it is a well-made movie that provokes thought.

One question I have pondered: must we choose only one direction, or can we achieve both at the same time if we are clever? Since I have been working non-stop and missing invaluable opportunities to spend time with family, I may have already chosen one side. Even so, I won’t forget the value of the other path.

To build a happy life, you must know what you want, what you wish to own, and what you want to experience in life. In other words, you must have a creative capacity that goes beyond the level of simple, comfortable, small everyday pleasures.

Positivity is a powerful energy that turns coincidence into destiny. This struck a chord with me, as someone who has always leaned toward pessimistic thinking under the excuse of keeping a critical point of view.

The three fundamental questions of life are these:

- What, exactly, makes me live?
- Doing work that benefits the world, with direct effects I can quantify.

- How can I part ways with an unsatisfying life?
- When I look back on my day, have done nothing I will regret.

- How can I find the standards for my own life?
- Take one step in the right direction every day, at a steady, moderate pace, so that I neither stumble nor look back.

A noble person does not need the approval of others, because he feels that he himself is the one who defines his own worth. What is harmful to him, he judges to be harmful in itself.

A person with high self-esteem believes the world revolves around them. This makes me reflect on my own habit of excessive deference to others.

Let us agonize deeply. A person’s rank is set by how deeply they can agonize. Those who have been through deep anguish come to know, thanks to it, more than others do. I, too, agonized a great deal while building a company, and the insights I gained feel on an entirely different level from anything I have read elsewhere. There was a case where, after agonizing over a problem daily for months, I came up with an approach no one else had managed to find.

At the moment you are reading this article, you are probably having some thought and feeling some emotion. We have all experienced feelings and processed piles of thoughts.

Research on the brain is still largely veiled. Researchers barely understand which part of the brain shows an electrical reaction in which situation. Likewise, emotions appear to occur randomly, yet follow some mechanism.

I know that if I have too much to deal with, I am prone to lose my temper and feel dissatisfied with what I’m working on. On the other hand, if I set up regular routines and manage to follow them, the simplicity gives me joy.

However, our brains have evolved to seek change, which leads to an excess of dopamine. All of these different cycles shape who I am now.

Even though I’m a sane person working toward the same vision, sometimes I’m motivated by my circumstances, and sometimes I’m overwhelmed by how much I must do to reach the goal.

Here is where the wisdom of the monks comes in. They focus on themselves and understand how they think; I’d rephrase it as: they understand what makes their thoughts.

In this wonderful world, where I don’t fully know myself or how my mind works, I will try to forget the irrelevant and focus on what is meaningful to me.

I got this idea from a paper that questioned whether transformers are really the best choice for time-series datasets. I used to believe that XGBoost was best for tabular data, but after running many tests, I found that transformers work better for time-series problems, subject to some conditions. One thing I can say publicly: the dataset should have many features (in my experience more than 20, ideally 50 or more, and all features should relate to, or ideally cause, the target).

When I use AI to solve real-world problems, understanding what causes what is essential. Mixing domain knowledge with creativity is key, but identifying those causal relationships is what really makes the difference.

Drawing from my own experience, having processed over 100 applications and conducted interviews with more than 10 candidates, I have come to appreciate the complexity of this process. Over time, it has become clear that the notion of the ‘perfect fit’, so often sought after in recruitment, is frequently elusive and may not add the most value to the business. Even the most polished and promising candidates on paper may lack equally essential qualities, a gap that can be uncovered only through insightful interaction and intuition.

The most effective hiring decisions do not merely focus on a candidate’s current qualifications and abilities. The real art of recruitment lies in identifying potential, diversity, and the propensity for growth. The principle of ‘hire for personality, train for skills’ resonates strongly in this context. Emphasizing these can result in hiring employees who, while perhaps not perfect fits initially, have the potential to bring value, drive innovation, and grow with the organization.

Organizations should understand that the art of hiring is about perceived fit rather than a perfect one. The right candidate is one who shares the organization’s vision, has the ability to adapt, and offers a unique perspective. This realization, garnered from my hands-on experience, has led me to view hiring as an opportunity to enrich the organization’s talent pool with diversity and potential.

Finally, the art of hiring concludes with an effective onboarding process, ensuring that new employees feel integrated and valued, laying the foundation for their productivity, satisfaction, and long-term retention. Therefore, successfully mastering the art of hiring involves being empathetic, intuitive, and dynamic at every stage of the recruitment process. This is key to not only hiring capable employees, but also to nurturing a workforce that aligns with and contributes significantly to realizing the company’s goals and culture.
