New pages

From CS Wiki
  • 21:48, 29 November 2024PHP-FPM Dynamic Process Management (hist | edit) ‎[3,693 bytes]Fortify (talk | contribs) (Created page with "'''PHP-FPM Dynamic Process Management''' refers to one of the modes available in PHP-FPM (FastCGI Process Manager) to manage worker processes. In this mode, the number of worker processes dynamically adjusts based on server load, ensuring efficient use of system resources while maintaining the ability to handle varying traffic levels. ==Overview== PHP-FPM is a robust process manager for PHP, and its dynamic mode is designed to strike a balance between performance and res...") Tag: Visual edit
  • 21:45, 29 November 2024PHP-FPM pm.max spare servers (hist | edit) ‎[2,705 bytes]Fortify (talk | contribs) (Created page with "'''pm.max_spare_servers''' is a configuration directive in PHP-FPM (FastCGI Process Manager) that specifies the maximum number of idle (spare) worker processes to maintain in the pool. It is used when the process manager (pm) is set to '''dynamic'''. This directive ensures that server resources are not wasted by limiting the number of idle worker processes. ==Overview== In dynamic process management mode, PHP-FPM adjusts the number of worker processes based on the server...") Tag: Visual edit
  • 21:44, 29 November 2024PHP-FPM pm.min spare servers (hist | edit) ‎[2,629 bytes]Fortify (talk | contribs) (Created page with "'''pm.min_spare_servers''' is a configuration directive in PHP-FPM (FastCGI Process Manager) used to specify the minimum number of idle (spare) worker processes that should be maintained in the pool. It is applicable when the process manager (pm) is set to '''dynamic'''. This directive ensures that there are always enough idle workers available to handle incoming requests without unnecessary delays. ==Overview== When PHP-FPM is configured to use the '''dynamic''' process...") Tag: Visual edit
  • 21:28, 29 November 2024Time Series Data (hist | edit) ‎[3,841 bytes]Fortify (talk | contribs) (Created page with "'''Time Series Data''' refers to a sequence of data points collected or recorded at successive, evenly spaced points in time. This type of data is used to track changes over time and is a critical component in various fields like finance, economics, environmental science, and machine learning. ==Overview== Time series data captures how a variable evolves over time. The primary characteristic of time series data is its temporal ordering, meaning that the order of the obse...") Tag: Visual edit
  • 21:17, 29 November 2024Lagged Time Series (hist | edit) ‎[3,014 bytes]Fortify (talk | contribs) (Created page with "'''Lagged Time Series''' refers to a transformation of time series data where previous values (lags) of the series are used to predict or understand future values. Lagged variables are essential in time series analysis and forecasting, as they help capture the temporal dependencies and autocorrelation within the data. ==Overview== In a lagged time series, the value of a variable at a specific time point is related to its values at earlier time points. This is particularl...") Tag: Visual edit
  • 19:56, 29 November 2024Shapley Value (hist | edit) ‎[3,378 bytes]Fortify (talk | contribs) (Created page with "'''Shapley Value''' is a concept from cooperative game theory that provides a fair distribution of a total "payout" among players based on their individual contributions to the group. It is widely used in economics, decision-making, and machine learning for feature attribution and model interpretability. The Shapley Value ensures that each participant's contribution is valued in a mathematically fair and consistent manner....") Tag: Visual edit
  • 17:38, 29 November 2024SHAP Analysis (hist | edit) ‎[2,985 bytes]Fortify (talk | contribs) (Created page with "'''SHAP Analysis''' (SHapley Additive exPlanations) is a machine learning interpretability technique based on cooperative game theory. It is used to explain the predictions of complex machine learning models by attributing the contribution of each feature to the model's output. SHAP values provide a consistent and mathematically sound way to interpret individual predictions and global feature importance. ==Overview== SHAP values are derived from Shapley values, a concept...") Tag: Visual edit
  • 17:14, 29 November 2024Beeswarm Plot (hist | edit) ‎[3,032 bytes]Fortify (talk | contribs) (Created page with "'''Beeswarm Plot''' is a data visualization technique used to display individual data points along a single axis, often overlaid with a distribution representation. It helps to visualize the spread, density, and clustering of data points while avoiding overlap. Beeswarm plots are commonly used in exploratory data analysis to understand data distributions and outliers. ==Overview== Beeswarm plots arrange individual data point...") Tag: Visual edit
  • 16:58, 29 November 2024Waterfall Plot (hist | edit) ‎[3,421 bytes]162.158.63.105 (talk) (Created page with "'''Waterfall Plot''' is a data visualization technique used to represent sequential changes in the value of a variable, often to show the cumulative effect of positive and negative contributions over time or across categories. This plot is commonly used in fields like financial analysis, engineering, and data science to break down the components of a total change. == Overview == A waterfall plot displays data as a series of bars, where: * Each bar represents an individu...") Tag: Visual edit: Switched
  • 13:14, 29 November 2024DNS PROBE FINISHED NXDOMAIN (hist | edit) ‎[3,020 bytes]Fortify (talk | contribs) (Created page with "'''DNS_PROBE_FINISHED_NXDOMAIN''' is an error message displayed by web browsers, such as Google Chrome, indicating that the Domain Name System (DNS) could not resolve the domain name into an IP address. This error means that the requested domain does not exist or the DNS query failed. == Overview == When a user tries to access a website, the DNS translates the domain name (e.g., example.com) into an IP address. The '''NXDOMAIN''' in the error message stands for "Non-Exi...") Tag: Visual edit: Switched
  • 00:39, 29 November 2024Extrapolation (Data Science) (hist | edit) ‎[2,797 bytes]Fortify (talk | contribs) (Created page with "'''Extrapolation''' is a data science technique used to estimate or predict values beyond the range of observed data. It involves extending a known trend, pattern, or relationship to predict outcomes for new, unobserved data points. While powerful, extrapolation can introduce significant errors if the assumptions about the data's behavior outside the observed range are incorrect. ==Overview== Extrapolation assumes that trends or relationships in the known data set remain...") Tag: Visual edit
  • 00:32, 29 November 2024Confounder (Data Science) (hist | edit) ‎[3,025 bytes]Fortify (talk | contribs) (Created page with "'''Confounder''' is a variable that influences both the dependent variable and one or more independent variables, potentially leading to a spurious association or bias in the analysis. In data science, identifying and addressing confounders is crucial to ensure the validity of causal inferences and statistical models. ==Overview== Confounders introduce bias by creating a false relationship between the variables of interest. If not properly controlled, they can lead to in...") Tag: Visual edit
  • 16:16, 24 November 2024Routing Table (hist | edit) ‎[3,574 bytes]Prairie (talk | contribs) (Created page with "'''Routing Table''' is a data structure used by routers to store and manage route information. It determines the best path for forwarding packets to their destination. Each entry in a routing table corresponds to a specific destination network and includes information about the next hop, metrics, and routing protocol. ==Overview== The routing table is a fundamental component of the control plane in a network. It is built and maintained by routing protocols or through man...") Tag: Visual edit
  • 16:10, 24 November 2024Longest Prefix Matching (hist | edit) ‎[3,042 bytes]Prairie (talk | contribs) (Created page with "'''Longest Prefix Matching (LPM)''' is an algorithm used in networking to determine the best matching route for a given IP address. It is primarily employed in routing tables and forwarding tables to decide the next hop for packet forwarding. The "longest prefix" refers to the route entry with the most specific (longest) subnet mask that matches the destination IP. ==Overview== Longest Prefix Matching ensures that packets are routed along the most specific path available...") Tag: Visual edit
  • 16:08, 24 November 2024Forwarding Table (hist | edit) ‎[3,098 bytes]Prairie (talk | contribs) (Created page with "'''Forwarding Table''' is a data structure used by network devices, such as routers and switches, to determine the next hop or output interface for forwarding packets. It plays a crucial role in the data plane, enabling efficient and accurate packet forwarding based on precomputed rules. ==Overview== The forwarding table maps packet header information, such as destination IP addresses or MAC addresses, to specific output ports or next-hop devices. It is typically built u...") Tag: Visual edit
  • 15:58, 24 November 2024Network Plane (hist | edit) ‎[4,253 bytes]Prairie (talk | contribs) (Created page with "'''Network Plane''' refers to the functional layers of a network architecture, each responsible for specific tasks in the operation and management of a network. These planes are essential to understanding and designing modern networks, particularly in the context of Software-Defined Networking (SDN) and traditional networking models. ==Overview== A network plane is a conceptual division that separates the functions of networking into distinct areas, enabling modularity a...") Tag: Visual edit
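  • 22:41, 21 November 2024금 시세 (hist | edit) ‎[4,216 bytes]Betripping (talk | contribs) (Created page with "'''금 시세''' (gold price) is the current market price of gold, which fluctuates in real time on international and domestic markets. It serves as an important benchmark for investment, manufacturing, jewelry making, and central bank policy decisions, and is influenced by a variety of economic factors. In Korea it is also commonly written as the single word "금값". ==Determinants of the Gold Price== *'''International gold price''' **On the international market, gold ... US dol...") Tag: Visual edit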
  • 20:48, 21 November 2024OpenFlow Controller (hist | edit) ‎[4,215 bytes]162.158.158.59 (talk) (Created page with "'''OpenFlow Controller''' is a central component in Software-Defined Networking (SDN) architectures. It provides centralized control and management of OpenFlow-enabled network devices, such as switches and routers, using the OpenFlow protocol. The controller enables dynamic and programmable network management by separating the control plane from the data plane. ==Overview== An OpenFlow Controller serves as the "brain" of an SDN network, making decisions about how packets...") Tag: Visual edit
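  • 20:04, 21 November 2024멀티캐스트 (hist | edit) ‎[3,773 bytes]172.70.114.101 (talk) (Created page with "'''Multicast''' ('''멀티캐스트''') is a network communication method in which one sender transmits data to multiple receivers belonging to a specific group. It sits between unicast (a single receiver) and broadcast (all receivers), and is an efficient, bandwidth-saving data transmission technique. ==Concept== In multicast, the sender's data ... only devices belonging to a specific group receive...") Tag: Visual edit
  • 19:57, 21 November 2024브로드캐스트 (hist | edit) ‎[6,273 bytes]162.158.62.213 (talk) (Created page with "'''Broadcast''' (브로드캐스트) is a communication method in computer networks in which one device transmits data to all devices on the same network. Broadcast is mainly used in LANs (Local Area Networks) and plays an important role in network operation. ==Key Concepts== '''Broadcast address''' *An IP address set so that all devices in the network receive the packet. In IPv4 networks, `2...") Tag: Visual edit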
  • 16:44, 20 November 2024Iron Curtain (hist | edit) ‎[2,385 bytes]Deposition (talk | contribs) (Created page with "The '''Iron Curtain''' refers to the political, military, and ideological barrier erected by the Soviet Union after World War II to separate itself and its satellite states in Eastern Europe from the Western world. The term symbolizes the division between communist and non-communist countries during the Cold War. === Origins of the Term === The phrase "Iron Curtain" became popular after it was used by British Prime Minister '...") Tag: Visual edit
  • 14:04, 20 November 2024Finite State Machine (hist | edit) ‎[3,023 bytes]Deposition (talk | contribs) (Created page with "A '''Finite State Machine''' (FSM) is a computational model used to design and analyze the behavior of systems. FSMs are characterized by a finite number of states, transitions between those states, and actions that result from those transitions. ==Overview== A finite state machine consists of: *A finite set of states. *A finite set of inputs. *A transition function that determines the next state for a given state and input. *An initial state. *(Optionally) a set of fina...") Tag: Visual edit
  • 23:44, 13 November 2024TCP 혼잡 제어 (hist | edit) ‎[3,175 bytes]Prairie (talk | contribs) (Created page with "'''TCP Congestion Control''' (TCP 혼잡 제어) is a mechanism in TCP (Transmission Control Protocol) for managing network congestion and reducing data loss. Congestion control dynamically adjusts the transmission rate according to network conditions, which improves the efficiency of network resources and plays an important role in preventing performance degradation caused by congestion. ==Main Phases of Congestion Control Algorithms== '''Congestion avoidance''' (Con...") Tag: Visual edit
  • 21:39, 13 November 2024TCP 시퀀스 번호 (hist | edit) ‎[2,760 bytes]Prairie (talk | contribs) (Created page with "'''TCP Sequence Number''' (TCP 시퀀스 번호) is a number used in TCP (Transmission Control Protocol) to track the order of data packets and to ensure that data lost in transit is retransmitted and that data is reassembled correctly. Sequence numbers play a very important role in a TCP connection: the sender assigns a unique number to each byte it transmits. Based on this, the receiving side ... packets ... the correct...") Tag: Visual edit
  • 15:30, 13 November 2024데이터베이스 후보 키 (hist | edit) ‎[2,823 bytes]핵톤 (talk | contribs) (Created page with "A '''Candidate Key''' (후보 키) is an attribute or set of attributes that can uniquely identify each row in a database table. A candidate key is a minimal set of attributes that can uniquely distinguish every row in the table, and is a candidate that may be chosen as the primary key. ==Conditions for a Candidate Key== To be a candidate key, the following conditions must be satisfied. *'''Uniqueness''' (유일성): ...") Tag: Visual edit
  • 15:28, 13 November 2024데이터베이스 보이스-코드 정규형 (hist | edit) ‎[3,302 bytes]핵톤 (talk | contribs) (Created page with "'''Boyce-Codd Normal Form''' (보이스-코드 정규형, BCNF) is the fourth stage of database normalization and a strengthened form of Third Normal Form (3NF). In addition to satisfying 3NF, Boyce-Codd Normal Form requires every determinant to be a candidate key, making the database design stricter. ==Conditions for Boyce-Codd Normal Form== To satisfy Boyce-Codd Normal Form, the following conditions must be met...") Tag: Visual edit
  • 14:55, 13 November 2024데이터베이스 제3정규형 (hist | edit) ‎[3,346 bytes]핵톤 (talk | contribs) (Created page with "'''Third Normal Form, 3NF''' ('''제3정규형''') is the third stage of database normalization; it aims to eliminate transitive dependencies within a table while satisfying Second Normal Form (2NF). Third Normal Form designs attributes to depend only on the primary key, reducing data redundancy and further strengthening data integrity. ==Conditions for Third Normal Form== To satisfy Third Normal Form...") Tag: Visual edit
  • 14:48, 13 November 2024부분 함수 종속성 (hist | edit) ‎[2,763 bytes]핵톤 (talk | contribs) (Created page with "'''Partial Functional Dependency''' ('''부분 함수 종속성'''), in database normalization, refers to the case where a relation with a composite key contains an attribute that depends on only part of the primary key. Partial functional dependencies can cause data redundancy and inefficient data structures, and eliminating them is the goal of Second Normal Form (2NF). ==Overview== Partial functional dependency ...") Tag: Visual edit
  • 14:46, 13 November 2024데이터베이스 제2정규형 (hist | edit) ‎[3,025 bytes]핵톤 (talk | contribs) (Created page with "'''Second Normal Form, 2NF''' ('''제2정규형''') is the second stage of database normalization; it aims to eliminate '''partial functional dependencies''' within a table while satisfying First Normal Form (1NF). Second Normal Form removes attributes that depend on only part of the primary key, reducing data redundancy and improving data integrity. ==Second Normal Form...") Tag: Visual edit
  • 14:35, 13 November 2024데이터베이스 제1정규형 (hist | edit) ‎[2,638 bytes]핵톤 (talk | contribs) (Created page with "'''First Normal Form''' (제1정규형, 1NF) is the first stage of database normalization, in which a table is designed so that every attribute holds an atomic value. That is, each column (attribute) in the table must hold a single value that cannot be divided further. This reduces data redundancy and strengthens data integrity. ==Conditions for First Normal Form== To satisfy First Normal Form, ...") Tag: Visual edit
  • 11:39, 13 November 2024Apache AllowOverride (hist | edit) ‎[4,080 bytes]Prairie (talk | contribs) (Created page with "The '''AllowOverride''' directive in Apache HTTP Server is used to specify which types of directives can be overridden by `.htaccess` files in specific directories. By default, Apache uses configuration files like `httpd.conf` or `apache2.conf` for global settings, but `AllowOverride` enables web administrators to override these settings at the directory level using `.htaccess` files. This is particularly useful for shared hosting environments where users may need to man...") Tag: Visual edit
  • 11:37, 13 November 2024Apache Require (hist | edit) ‎[3,845 bytes]Prairie (talk | contribs) (Created page with "The '''Require''' directive in Apache HTTP Server is used to control access to resources by specifying conditions that clients must meet to be granted access. The `Require` directive is commonly used for user authentication, IP-based access control, and group-based restrictions, enhancing the security and flexibility of web applications. ==Purpose of Require== The '''Require''' directive enables fine-grained access control by setting specific conditions. This can be usef...") Tag: Visual edit
  • 11:34, 13 November 2024Apache AddType (hist | edit) ‎[3,243 bytes]Prairie (talk | contribs) (Created page with "The '''AddType''' directive in Apache HTTP Server is used to define or change the MIME (Multipurpose Internet Mail Extensions) type for specific file extensions. MIME types tell the browser how to handle files received from the server, such as rendering HTML, displaying images, or executing scripts. Setting the correct MIME type is essential for the server to communicate file handling instructions to the client. ==Purpose of AddType== The '''AddType''' directive helps in...") Tag: Visual edit
  • 11:12, 13 November 2024Apache Options MultiViews (hist | edit) ‎[3,061 bytes]Prairie (talk | contribs) (Created page with "The '''Options Multiviews''' directive in Apache HTTP Server allows content negotiation by enabling the server to automatically select the best-matching file based on the client’s request. When enabled, the `Multiviews` option allows Apache to match and serve files with various extensions without requiring the full file name in the URL, improving flexibility in file handling and localization. ==Purpose of Options Multiviews== The '''Options Multiviews''' directive help...") Tag: Visual edit
  • 11:11, 13 November 2024Apache Options Indexes (hist | edit) ‎[2,289 bytes]Prairie (talk | contribs) (Created page with "The '''Options Indexes''' directive in Apache HTTP Server configures the display of directory listings. When enabled, this option allows users to see a list of files in a directory if no default file (like `index.html` or `index.php`) is present. This can be useful for browsing available files, but it also presents security considerations, as it can expose sensitive information. ==Purpose of Options Indexes== The '''Options Indexes''' directive controls whether Apache wi...") Tag: Visual edit
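  • 22:15, 12 November 2024TCP 왕복 시간 (hist | edit) ‎[3,888 bytes]Prairie (talk | contribs) (Created page with "'''TCP Round Trip Time''' (TCP 왕복시간, TCP RTT) is the time it takes, in a TCP connection, from sending a packet to receiving the acknowledgment (ACK) from the receiver. RTT is an important measure of network delay and is essential information for TCP to maintain an optimal data transmission rate and minimize packet loss. TCP RTT varies with network quality, bandwidth, and delay factors, and through it network congestion...") Tag: Visual edit
  • 22:02, 12 November 2024TIME WAIT 상태 (hist | edit) ‎[3,552 bytes]Prairie (talk | contribs) (Created page with "The '''Time Wait state''' (Time Wait 상태) is a state that is held for a fixed period after a TCP connection is closed, before that connection's port number can be reused. It is used to complete the connection termination process in the TCP/IP protocol safely and to prevent problems caused by packet retransmission. ==Overview== A TCP connection goes through a process in which both the sender and the receiver close the connection, and through this, smoothly and...") Tag: Visual edit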
  • 21:26, 12 November 2024TIME WAIT state (hist | edit) ‎[2,829 bytes]Prairie (talk | contribs) (Created page with "The '''TIME_WAIT state''' is a crucial phase in the Transmission Control Protocol (TCP) that occurs after a connection has been terminated. This state ensures that all data packets have been properly transmitted and acknowledged, preventing potential issues from delayed packets in the network. ==Purpose of TIME_WAIT== 1. '''Preventing Delayed Packet Issues''': After a connection closes, packets that were delayed in the network might still arrive. The TIME_WAIT state ensu...") Tag: Visual edit
  • 19:10, 4 November 2024Missing Data (hist | edit) ‎[5,902 bytes]핵톤 (talk | contribs) (Created page with "Missing Data refers to the absence of values in a dataset, which can occur due to various reasons such as data entry errors, equipment malfunctions, or privacy concerns. Handling missing data is crucial in data science and machine learning, as it can impact the quality, accuracy, and interpretability of models. Properly addressing missing values ensures that analyses are more reliable and that models generalize well to new data. ==Types of Missing Data== There are three...") Tag: Visual edit
  • 19:04, 4 November 2024Normalization (Data Science) (hist | edit) ‎[5,085 bytes]핵톤 (talk | contribs) (Created page with "Normalization in data science is a preprocessing technique used to adjust the values of numerical features to a common scale, typically between 0 and 1 or -1 and 1. Normalization ensures that features with different ranges contribute equally to the model, improving training stability and model performance. It is especially important in machine learning algorithms that rely on distance calculations, such as k-nearest neighbors (kNN) and clustering. ==Importance of Normali...") Tag: Visual edit
  • 17:49, 4 November 2024Bias-Variance Trade-Off (hist | edit) ‎[6,277 bytes]핵톤 (talk | contribs) (Created page with "The Bias-Variance Trade-Off is a fundamental concept in machine learning that describes the balance between two sources of error that affect model performance: bias and variance. The goal is to achieve a balance between bias and variance that minimizes the model’s total error, enabling it to generalize well to new, unseen data. ==Understanding Bias and Variance== *'''Bias''': Refers to the error introduced by approximating a complex real-world problem with a simplified...") Tag: Visual edit
  • 17:10, 4 November 2024Decision Tree Prunning (hist | edit) ‎[4,593 bytes]핵톤 (talk | contribs) (Created page with "Pruning is a technique used in decision trees and machine learning to reduce the complexity of a model by removing sections of the tree that provide little predictive power. The primary goal of pruning is to prevent overfitting, ensuring that the model generalizes well to unseen data. Pruning is widely used in decision trees and ensemble methods, such as random forests, to create simpler, more interpretable models. ==Types of Pruning== There are two main types of pruning...") Tag: Visual edit
  • 17:05, 4 November 2024N-Fold Cross-Validation (hist | edit) ‎[5,458 bytes]핵톤 (talk | contribs) (Created page with "N-Fold Cross-Validation is a technique used in machine learning to evaluate a model's performance by dividing the dataset into multiple subsets, or "folds." In this method, the dataset is split into N equal parts, where the model is trained on N-1 folds and tested on the remaining fold. This process is repeated N times, each time using a different fold as the test set, and the results are averaged to obtain an overall performance estimate. N-fold cross-validation helps t...") Tag: Visual edit
  • 16:50, 4 November 2024Undersampling (hist | edit) ‎[5,426 bytes]핵톤 (talk | contribs) (Created page with "'''Undersampling is a technique used in data science and machine learning to address class imbalance by reducing the number of samples in the majority class'''. Unlike oversampling, which increases the representation of the minority class, undersampling aims to balance the dataset by removing instances from the majority class. This technique is commonly applied in scenarios where the majority class significantly outnumbers the minority class, such as fraud detection...") Tag: Visual edit
  • 16:47, 4 November 2024Oversampling (hist | edit) ‎[5,524 bytes]핵톤 (talk | contribs) (Created page with "Oversampling is a technique used in data science and machine learning to address class imbalance by increasing the number of samples in the minority class. In classification tasks with imbalanced datasets, oversampling helps to balance the distribution of classes, allowing the model to learn patterns from both majority and minority classes. Oversampling is commonly used in applications such as fraud detection, medical diagnosis, and other areas where certain classes are...") Tag: Visual edit
  • 16:42, 4 November 2024Stratified Sampling (hist | edit) ‎[4,799 bytes]핵톤 (talk | contribs) (Created page with "Stratified Sampling is a sampling technique used to ensure that subsets of data (called “strata”) maintain the same distribution of key characteristics as the original dataset. In data science and machine learning, stratified sampling is often used to create training, validation, and test splits, particularly when dealing with imbalanced datasets. This method ensures that each subset is representative of the entire dataset, improving the model's ability to generalize...") Tag: Visual edit
  • 16:36, 4 November 2024Data Partition (hist | edit) ‎[5,033 bytes]핵톤 (talk | contribs) (Created page with "'''Data Partition is a process in data science and machine learning where a dataset is divided into separate subsets to train, validate, and test a model'''. Data partitioning ensures that the model is evaluated on data it has not seen before, helping prevent overfitting and ensuring that it generalizes well to new data. Common partitions include training, validation, and test sets, each serving a specific purpose in the model development process. ==Types of Data Partiti...") Tag: Visual edit
  • 16:16, 4 November 2024Feature Selection (hist | edit) ‎[5,622 bytes]핵톤 (talk | contribs) (Created page with "'''Feature Selection is a process in machine learning and data science that involves identifying and selecting the most relevant features (or variables) in a dataset to improve model performance, reduce overfitting, and decrease computational cost'''. By removing irrelevant or redundant features, feature selection simplifies the model, enhances interpretability, and often improves accuracy. ==Importance of Feature Selection== Feature selection is a crucial step in the mo...") Tag: Visual edit
  • 16:06, 4 November 2024Information Gain (hist | edit) ‎[4,797 bytes]핵톤 (talk | contribs) (Created page with "Information Gain is a metric used in machine learning to measure the effectiveness of a feature in classifying data. It quantifies the reduction in entropy (impurity) achieved by splitting a dataset based on a particular feature. Information gain is widely used in decision tree algorithms to select the best feature for each node split, maximizing the model’s predictive accuracy. ==Definition of Information Gain== Information gain is defined as the difference in entropy...") Tag: Visual edit
  • 16:04, 4 November 2024Impurity (Data Science) (hist | edit) ‎[4,930 bytes]핵톤 (talk | contribs) (Created page with "In data science, impurity refers to the degree of heterogeneity in a dataset, specifically within a group of data points. Impurity is commonly used in decision trees to measure how "mixed" the classes are within each node or split. A high impurity indicates a mix of different classes, while a low impurity suggests that the data is homogenous or predominantly from a single class. Impurity measures guide the decision tree-building process by helping identify the best featu...") Tag: Visual edit