Sherman Visual Lab: Science Studio, Computer Science


04/26/2025

Saturday

eEducation, eBusiness & eArts


Computer Science

From Dirac Notation to Probability Bracket Notation,
Information Retrieval (IR) and Artificial Intelligence (AI)

Author: Dr. Xing M (Sherman) Wang

Dirac notation (or bra-ket notation) is a very powerful and indispensable tool for modern physicists. Unfortunately, it is taught only in Quantum Mechanics. I believe it would be great to introduce it in Applied Mathematics as well (for example, in Linear Algebra).

On the other hand, while studying probability theory, I felt that it would be very helpful to have a similar notation for representing and deriving probabilistic formulas.

That is why I posted the following articles online, in which Dirac notation is introduced to IR, and Probability Bracket Notation is proposed and applied to IR and AI.

Whether you agree or disagree with my work, I welcome and appreciate your opinions.

How to email the author:
Subject line:	About the articles on your web site
Email address:	swang (at) shermanlab (dot) com, or from arxiv.org if you are a member

1: Dirac Notation, Fock Space and Riemann Metric Tensor in IR Models

HTML; PDF: Current (06/21/2011); Archived

Abstract

Using Dirac notation as a powerful tool, we investigate the three classical Information Retrieval (IR) models and some of their extensions. We show that almost all such models can be described by vectors in Occupation Number Representations (ONR) of Fock spaces with various specifications on, e.g., occupation number, inner product or term-term interactions. As important case studies, the Concept Fock Space (CFS) is introduced for the Boolean Model, and the basic formulas for Singular Value Decomposition (SVD) of the Latent Semantic Indexing (LSI) Model are manipulated in terms of Dirac notation. Based on SVD, a Riemannian metric tensor is introduced, which can be used not only to calculate the relevance of documents to a query, but also to measure the closeness of documents in data clustering.

2: Probability Bracket Notation, Markov Chains, Stochastic Processes and Microscopic Probabilistic Processes

PDF: Current (09/04/2024); Archived

Abstract

Inspired by the Dirac vector bracket notation (VBN), we propose the Probability Bracket Notation (PBN), a new set of symbols defined similarly (but not identically) to those in the VBN. Applying the PBN to fundamental definitions and theorems for discrete and continuous random variables, we show that the PBN could play a role in the probability space similar to that of the VBN in the Hilbert vector space. Our system P-kets are identified with the probability vectors in Markov chains (MC). The master equation of a homogeneous MC in the Schrödinger picture can be basis-independent. Our system P-bra is linked to the Doi state function and the Peliti standard bra. Transformed from the Schrödinger picture to the Heisenberg picture, the time dependence of the system P-ket of a homogeneous MC (HMC) is shifted to the observable as a stochastic process. Using the correlations established by the special Wick rotation (SWR), the microscopic probabilistic processes (MPPs) are investigated for single and many-particle systems. The expected occupation number of particles in quantum statistics is reproduced by associating time with temperature (the Wick-Matsubara relation).
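As a rough illustration of the probability vectors that this abstract identifies with system P-kets, the following minimal sketch evolves the state distribution of a homogeneous Markov chain in discrete time. It uses ordinary list arithmetic rather than the PBN itself; the two-state transition matrix `T`, the initial vector `p0`, the row-vector convention, and the helper names `step` and `evolve` are all assumptions made for this example and are not taken from the article.

```python
def step(p, T):
    """One time step: p_next[j] = sum_i p[i] * T[i][j] (row-vector convention)."""
    n = len(p)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]


def evolve(p0, T, steps):
    """Apply the same (time-independent) transition matrix `steps` times."""
    p = list(p0)
    for _ in range(steps):
        p = step(p, T)
    return p


# Hypothetical row-stochastic transition matrix: each row sums to 1.
T = [[0.9, 0.1],
     [0.5, 0.5]]
p0 = [1.0, 0.0]  # the chain starts in state 0 with certainty

p2 = evolve(p0, T, 2)       # distribution after two steps
p_long = evolve(p0, T, 200)  # approaches the stationary distribution

print(p2)
print(p_long)
```

Because the chain is homogeneous, the same matrix is reused at every step, so the distribution after t steps depends only on the initial vector and on t.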

3: Induced Hilbert Space, Markov Chain, Diffusion Map and Fock Space in Thermophysics

PDF: Current (04/08/2007); Archived

Abstract

In this article, we continue to explore Probability Bracket Notation (PBN), proposed in our previous article. Using both Dirac vector bracket notation (VBN) and PBN, we define induced Hilbert space and induced sample space, and propose that there exists an equivalence relation between a Hilbert space and a probability sample space constructed from the same base observable(s). Then we investigate Markov transition matrices and their eigenvectors to make diffusion maps with two examples: a simple graph theory example, to serve as a prototype of a bidirectional transition operator; and a famous text document example in the IR literature, to serve as a tutorial on diffusion maps in text document space. We notice that, in both examples, the sample space of the Markov chain and the Hilbert space spanned by the eigenvectors of the transition matrix are not equivalent. At the end, we apply our PBN and equivalence proposal to Thermophysics by associating phase space with the Hilbert space or Fock space of many-particle systems.

4: Probability Bracket Notation: Term Vector Space, Concept Fock Space and Induced Probabilistic IR Models

PDF: Current (06/21/2011); Archived

Abstract

After a brief introduction to Probability Bracket Notation (PBN) for discrete random variables in time-independent probability spaces, we apply both PBN and Dirac notation to investigate probabilistic modeling for information retrieval (IR). We derive the expressions of relevance of document to query (RDQ) for various probabilistic models, induced by Term Vector Space (TVS) and by Concept Fock Space (CFS). The inference network model (INM) formula is symmetric and can be used to evaluate relevance of document to document (RDD); the CFS-induced models contain ingredients of all three classical IR models. The relevance formulas are tested and compared on different scenarios against a famous textbook example.

5: Probability Bracket Notation, Multivariable Systems and Static Bayesian Networks

PDF: Current (03/08/2025); Archived

Abstract

The Probability Bracket Notation (PBN) is used to analyze multiple discrete random variables in static Bayesian Networks (BN) through probabilistic graphical models. We briefly introduce the definitions of probability distributions in multivariable systems and their presentations using PBN, then explore the well-known student BN. Our analysis includes calculating various joint, marginal, intermediate, and conditional probability distributions, completing homework assignments, examining relationships between variables (dependence, independence, and conditional independence), and disclosing the power of and restrictions on inserting P-identity operators. We also show the reasoning capabilities of the Student BN using bottom-up and top-down approaches, validated by Elvira software. In the last section, we discuss BNs with continuous variables. After reviewing linear Gaussian networks, we introduce a customized Healthcare BN that includes continuous and discrete random variables, incorporates user-specific data, and offers tailored predictions through discrete-display (DD) nodes, serving as proxies for their continuous variable parents. Our investigation demonstrates that the PBN delivers a reliable and efficient approach for managing multiple variables in static Bayesian networks, a crucial aspect of Machine Learning (ML) and Artificial Intelligence (AI).
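To make the joint, marginal and conditional distributions mentioned in this abstract concrete, here is a minimal sketch for a three-node fragment (Difficulty and Intelligence as parents of Grade) written in ordinary probability arithmetic rather than in PBN. The conditional probability values and the names `P_D`, `P_I`, `P_G1` and `joint` are illustrative assumptions for this example; they are not quoted from the article.

```python
# Hypothetical priors for the two parent nodes.
P_D = {"d0": 0.6, "d1": 0.4}  # Difficulty
P_I = {"i0": 0.7, "i1": 0.3}  # Intelligence

# Hypothetical conditional probabilities P(Grade = g1 | Intelligence, Difficulty).
P_G1 = {
    ("i0", "d0"): 0.30,
    ("i0", "d1"): 0.05,
    ("i1", "d0"): 0.90,
    ("i1", "d1"): 0.50,
}


def joint(i, d):
    """Joint probability P(i, d, g1) from the network factorization."""
    return P_I[i] * P_D[d] * P_G1[(i, d)]


# Marginal P(g1): sum the joint over both parents.
marginal_g1 = sum(joint(i, d) for i in P_I for d in P_D)

# Bottom-up reasoning: P(i1 | g1) by Bayes' rule.
posterior_i1 = sum(joint("i1", d) for d in P_D) / marginal_g1

print(marginal_g1)
print(posterior_i1)
```

The marginal is obtained by summing the joint over every value of the parent variables, which is the same bookkeeping that any notation for multivariable systems has to organize.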

6: Probability Bracket Notation: Markov Sequence Projector of Visible and Hidden Markov Models in Dynamic Bayesian Networks

PDF: Current (02/19/2025); Archived

Abstract

With the symbolic framework of Probability Bracket Notation (PBN), the Markov Sequence Projector (MSP) is introduced to expand the evolution formula of Homogeneous Markov Chains (HMCs). The well-known weather example, a Visible Markov Model (VMM), illustrates that the full joint probability of a VMM corresponds to a specifically projected Markov state sequence in the expanded evolution formula. In a Hidden Markov Model (HMM), the probability basis (P-basis) of the hidden Markov state sequence and the P-basis of the observation sequence exist in the sequential event space. The full joint probability of an HMM is the product of the (unknown) projected hidden sequence of Markov states and their transformations into the observation P-bases. The Viterbi algorithm is applied to the famous Weather-Stone HMM example to determine the most likely weather-state sequence given the observed stone-state sequence. Our results are verified using the Elvira software package. Using the PBN, we unify the evolution formulas for Markov models like VMMs, HMMs, and factorial HMMs (with discrete time). We briefly investigate the extended HMM, addressing the feedback issue, and the continuous-time VMM and HMM (with discrete or continuous states). All these models are subclasses of Dynamic Bayesian Networks (DBNs), which are essential for Machine Learning (ML) and Artificial Intelligence (AI).
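The Viterbi step mentioned in this abstract can be sketched as follows. This is a generic textbook-style implementation, not the article's PBN formulation, and every number in it is hypothetical: the state names, the observation names, and the start, transition and emission probabilities are assumptions chosen only to make the example runnable, not the values of the Weather-Stone example.

```python
def viterbi(obs, states, start, trans, emit):
    """Return (most likely hidden state path, its joint probability with obs)."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start[s] * emit[s][obs[0]] for s in states}]
    back = []  # back[t - 1][s]: best predecessor of state s at time t
    for o in obs[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            r = max(states, key=lambda q: prev[q] * trans[q][s])
            scores[s] = prev[r] * trans[r][s] * emit[s][o]
            pointers[s] = r
        best.append(scores)
        back.append(pointers)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical model parameters (illustration only).
states = ["sunny", "rainy"]
start = {"sunny": 0.6, "rainy": 0.4}
trans = {"sunny": {"sunny": 0.7, "rainy": 0.3},
         "rainy": {"sunny": 0.4, "rainy": 0.6}}
emit = {"sunny": {"dry": 0.9, "wet": 0.1},
        "rainy": {"dry": 0.2, "wet": 0.8}}

path, prob = viterbi(["dry", "wet", "wet"], states, start, trans, emit)
print(path, prob)
```

For long observation sequences, a practical implementation would usually work with log-probabilities to avoid numerical underflow; the plain products are kept here for readability.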

7: Thematic Clustering and the Dual Representations of Text Objects

PDF: Current (01/02/2017)

Abstract

We introduce Thematic Clustering, a new methodology to discover clusters in a set of text documents and, at the same time, to define the theme of each cluster by using its most frequent keywords. Our procedure is based on the idea of dual representations (TF rep and Concept rep) of text objects (docs or clusters) in term space. We derive cluster TF reps in initial clustering, use them to reduce term space and then renovate clusters. Our test results on three well-known data sets (Disease, Star and Reuters) are very promising: the formed clusters and their themes almost perfectly match our knowledge about the data sets.
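One simple reading of "defining the theme of a cluster by its most frequent keywords" is sketched below: the term frequencies of all documents in a cluster are summed and the top terms are taken as the theme. This is an assumption about the general idea only, not the article's procedure; the function name `cluster_theme`, the `top_k` parameter and the toy documents are hypothetical.

```python
from collections import Counter


def cluster_theme(docs, top_k=2):
    """Sum term frequencies over a cluster's documents and return the top_k terms."""
    tf = Counter()
    for tokens in docs:
        tf.update(tokens)  # accumulate per-document term counts
    return [term for term, _ in tf.most_common(top_k)]


# Toy cluster of already tokenized documents (illustration only).
cluster = [
    ["flu", "fever", "cough", "fever"],
    ["fever", "flu", "vaccine"],
]

theme = cluster_theme(cluster)
print(theme)
```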


More to come, please visit us again!



Copyright © 2002-2024, Sherman Visual Lab
